Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update the prompt to include needs-info #296

Merged
merged 5 commits into from
Sep 4, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 7 additions & 4 deletions pkgs/sdk_triage_bot/lib/src/common.dart
Original file line number Diff line number Diff line change
Expand Up @@ -37,11 +37,14 @@ String get geminiKey {
return token;
}

/// Don't return more than 5k of text for an issue body.
String trimmedBody(String body) {
const textLimit = 5 * 1024;
/// Maximal length of body used for querying.
const bodyLengthLimit = 10 * 1024;

return body.length > textLimit ? body = body.substring(0, textLimit) : body;
/// The [body], truncated if larger than [bodyLengthLimit].
String trimmedBody(String body) {
return body.length > bodyLengthLimit
? body = body.substring(0, bodyLengthLimit)
: body;
}

class Logger {
Expand Down
9 changes: 5 additions & 4 deletions pkgs/sdk_triage_bot/lib/src/gemini.dart
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,8 @@ import 'package:google_generative_ai/google_generative_ai.dart';
import 'package:http/http.dart' as http;

class GeminiService {
// gemini-1.5-pro-latest, gemini-1.5-flash-latest, gemini-1.0-pro-latest
// Possible values for models: gemini-1.5-pro-latest, gemini-1.5-flash-latest,
// gemini-1.0-pro-latest, gemini-1.5-flash-exp-0827.
static const String classificationModel = 'models/gemini-1.5-flash-latest';
static const String summarizationModel = 'models/gemini-1.5-flash-latest';

Expand All @@ -33,17 +34,17 @@ class GeminiService {

/// Call the summarize model with the given prompt.
///
/// On failures, this will throw a `GenerativeAIException`.
/// On failures, this will throw a [GenerativeAIException].
Future<String> summarize(String prompt) {
return _query(_summarizeModel, prompt);
}

/// Call the classify model with the given prompt.
///
/// On failures, this will throw a `GenerativeAIException`.
/// On failures, this will throw a [GenerativeAIException].
Future<List<String>> classify(String prompt) async {
final result = await _query(_classifyModel, prompt);
final labels = result.split(',').map((l) => l.trim()).toList();
final labels = result.split(',').map((l) => l.trim()).toList()..sort();
mosuem marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How big do we expect the result to be? (Or rather, how many commas?)

This can do an extra string allocation per part, if the result text has space after commas.

Doing

  final labels = result.trim.split(_trimCommaRE).toList()..sort();
...

final _trimCommaRE = RegExp(r'\s*,\s*');

will still trim the outer string (but it's more likely to not have leading and trailing spaces, than commas are to not have surrounding space), and then remove spaces around commas while splitting.

(The sum of many small inefficiencies is a medium inefficiency that it's hard to find the source of. But ... this uses a RegExp, so now we have two problems 😉.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:)

Thanks for the feedback; this will generally have either 0 or 1 commas. I think I'd prefer to avoid a regex for this bit of code.

return labels;
}

Expand Down
128 changes: 119 additions & 9 deletions pkgs/sdk_triage_bot/lib/src/prompts.dart
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,8 @@
// for details. All rights reserved. Use of this source code is governed by a
// BSD-style license that can be found in the LICENSE file.

// TODO(devoncarew): Add additional prompt instructions for `area-pkg` issues.

String assignAreaPrompt({
required String title,
required String body,
Expand Down Expand Up @@ -45,13 +47,110 @@ If the issue is clearly a bug report, then also apply the label 'type-bug'.
If the issue is mostly a question, then also apply the label 'type-question'.
Otherwise don't apply a 'type-' label.

If the issue title starts with "[breaking change]" it was likely created using
existing issue template; do not assign an area label. IMPORTANT: only do this if
the issue title starts with "[breaking change]".

If the issue was largely unchanged from our default issue template, then apply
the 'needs-info' label and don't assign an area label. These issues will
generally have a title of "Create an issue" and the body will start with "Thank
you for taking the time to file an issue!".

If the issue title is "Analyzer Feedback from IntelliJ", these are generally not
well qualified. For these issues, apply the 'needs-info' label but don't assign
an area label.

Return the labels as comma separated text.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Returns"

(A section starting with "Returns" is the recommended way to describe the returned value, just like a section starting with "The [foo] value ..." referencing a parameter is the recommended way to document that parameter.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a directive to the LLM about what it should do, not describing what it will do. As in, "You should return the ...".


Issue follows:
Here are a series of few-shot examples:

<EXAMPLE>
INPUT: title: Create an issue

body: Thank you for taking the time to file an issue!

This tracker is for issues related to:

Dart analyzer and linter
Dart core libraries (dart:async, dart:io, etc.)
Dart native and web compilers
Dart VM

OUTPUT: needs-info
</EXAMPLE>

<EXAMPLE>
INPUT: title: Analyzer Feedback from IntelliJ

body: ## Version information

- `IDEA AI-202.7660.26.42.7351085`
- `3.4.4`
- `AI-202.7660.26.42.7351085, JRE 11.0.8+10-b944.6842174x64 JetBrains s.r.o, OS Windows 10(amd64) v10.0 , screens 1600x900`

OUTPUT: needs-info
</EXAMPLE>

<EXAMPLE>
INPUT: title: Support likely() and unlikely() hints for AOT code optimization

body: ```dart
// Tell the compiler which branches are going to be taken most of the time.

if (unlikely(n == 0)) {
// This branch is known to be taken rarely.
} else {
// This branch is expected to be in the hot path.
}

final result = likely(s == null) ? commonPath() : notTakenOften();
```

Please add support for the `likely()` and `unlikely()` optimization hints within branching conditions. The AOT compiler can use these hints to generate faster code in a hot path that contains multiple branches.

$title
OUTPUT: area-vm, type-enhancement, type-performance
</EXAMPLE>

$body
<EXAMPLE>
INPUT: title: Analyzer doesn't notice incorrect return type of generic method

body: dart analyze gives no errors on the follow code:

```dart
void main() {
method(getB());
}

void method(String b) => print(b);

B getB<B extends A>() {
return A() as B;
}

class A {}
```
I would have suspected it to say something along the line of **The argument type 'A' can't be assigned to the parameter type 'String'.**

OUTPUT: area-analyzer, type-enhancement
</EXAMPLE>

<EXAMPLE>
INPUT: title: DDC async function stepping improvements

body: Tracking issue to monitor progress on improving debugger stepping through async function bodies.

The new DDC async semantics expand async function bodies into complex state machines. The normal JS stepping semantics don't map cleanly to steps through Dart code given this lowering. There are a couple potential approaches to fix this:
1) Add more logic to the Dart debugger to perform custom stepping behavior when stepping through async code.
2) Modify the async lowering in such a way that stepping more closely resembles stepping through Dart. For example, rather than returning multiple times, the state machine function might be able to yield. Stepping over a yield might allow the debugger to stay within the function body.

OUTPUT: area-web
</EXAMPLE>

The issue to triage follows:

title: $title

body: $body

${lastComment ?? ''}'''
.trim();
Expand All @@ -60,15 +159,26 @@ ${lastComment ?? ''}'''
String summarizeIssuePrompt({
required String title,
required String body,
required bool needsInfo,
}) {
const needsMoreInfo = '''
Our classification model determined that we'll need more information to triage
this issue. Thank them for their contribution and gently prompt them to provide
more information.
''';

final responseLimit = needsInfo ? '' : ' (1-2 sentences, 24 words or less)';

return '''
You are a software engineer on the Dart team at Google. You are responsible for
triaging incoming issues from users. For each issue, briefly summarize the issue
(1-2 sentences, 24 words or less).
You are a software engineer on the Dart team at Google.
You are responsible for triaging incoming issues from users.
For each issue, briefly summarize the issue $responseLimit.

${needsInfo ? needsMoreInfo : ''}

Issue follows:
The issue to triage follows:

$title
title: $title

$body''';
body: $body''';
}
54 changes: 34 additions & 20 deletions pkgs/sdk_triage_bot/lib/triage.dart
Original file line number Diff line number Diff line change
Expand Up @@ -70,37 +70,45 @@ ${trimmedBody(comment.body ?? '')}
}
}

// ask for the summary
var bodyTrimmed = trimmedBody(issue.body);
String summary;

// ask for the 'area-' classification
List<String> newLabels;
try {
summary = await geminiService.summarize(
summarizeIssuePrompt(title: issue.title, body: bodyTrimmed),
newLabels = await geminiService.classify(
assignAreaPrompt(
title: issue.title,
body: bodyTrimmed,
lastComment: lastComment,
),
);
} on GenerativeAIException catch (e) {
// Failures here can include things like gemini safety issues, ...
stderr.writeln('gemini: $e');
exit(1);
}

logger.log('## gemini summary');
logger.log('');
logger.log(summary);
logger.log('');

// ask for the 'area-' classification
List<String> newLabels;
// ask for the summary
String summary;
try {
newLabels = await geminiService.classify(
assignAreaPrompt(
title: issue.title, body: bodyTrimmed, lastComment: lastComment),
summary = await geminiService.summarize(
summarizeIssuePrompt(
title: issue.title,
body: bodyTrimmed,
needsInfo: newLabels.contains('needs-info'),
),
);
} on GenerativeAIException catch (e) {
// Failures here can include things like gemini safety issues, ...
stderr.writeln('gemini: $e');
exit(1);
}

logger.log('## gemini summary');
logger.log('');
logger.log(summary);
logger.log('');

logger.log('## gemini classification');
logger.log('');
logger.log(newLabels.toString());
Expand All @@ -123,9 +131,9 @@ ${trimmedBody(comment.body ?? '')}
// create github comment
await githubService.createComment(sdkSlug, issueNumber, comment);

final allRepoLabels = (await githubService.getAllLabels(sdkSlug)).toSet();
final labelAdditions = newLabels.toSet().intersection(allRepoLabels).toList()
..sort();
final allRepoLabels = await githubService.getAllLabels(sdkSlug);
final labelAdditions =
filterLegalLabels(newLabels, allRepoLabels: allRepoLabels);
if (labelAdditions.isNotEmpty) {
labelAdditions.add('triage-automation');
}
Expand All @@ -141,7 +149,13 @@ ${trimmedBody(comment.body ?? '')}
logger.log('Triaged ${issue.htmlUrl}');
}

List<String> filterExistingLabels(
List<String> allLabels, List<String> newLabels) {
return newLabels.toSet().intersection(allLabels.toSet()).toList();
List<String> filterLegalLabels(
List<String> labels, {
required List<String> allRepoLabels,
}) {
final validLabels = allRepoLabels.toSet();
return [
for (var label in labels)
if (validLabels.contains(label)) label,
]..sort();
}
29 changes: 19 additions & 10 deletions pkgs/sdk_triage_bot/tool/bench.dart
Original file line number Diff line number Diff line change
Expand Up @@ -51,7 +51,7 @@ void main(List<String> args) async {
await githubService.fetchIssue(sdkSlug, expectation.issueNumber);
final bodyTrimmed = trimmedBody(issue.body);

print('#${issue.number}');
print('#${issue.number}: ${expectation.expectedLabels.join(', ')}');

try {
final labels = await geminiService.classify(
Expand All @@ -60,12 +60,7 @@ void main(List<String> args) async {
if (expectation.satisfiedBy(labels)) {
predicted++;
} else {
var title = issue.title.length > 100
? '${issue.title.substring(0, 100)}...'
: issue.title;
print(' "$title"');
print(' labeled: ${expectation.expectedLabels.join(', ')}');
print(' prediction: ${labels.join(', ')}');
stderr.writeln(' bot: ${labels.join(', ')}');
}
} on GenerativeAIException catch (e) {
// Failures here can include things like gemini safety issues, ...
Expand Down Expand Up @@ -114,8 +109,22 @@ class ClassificationResults {
}

bool satisfiedBy(List<String> labels) {
final filtered = labels.where((l) => !l.startsWith('type-')).toSet();
final expected = expectedLabels.where((l) => !l.startsWith('type-'));
return expected.every(filtered.contains);
// Handle a `needs-info` label.
if (expectedLabels.contains('needs-info')) {
devoncarew marked this conversation as resolved.
Show resolved Hide resolved
return labels.contains('needs-info');
}

// Handle a `breaking-change-request` label.
if (expectedLabels.contains('breaking-change-request')) {
return labels.contains('breaking-change-request');
}

for (final label in expectedLabels.where((l) => l.startsWith('area-'))) {
if (!labels.contains(label)) {
return false;
}
}

return true;
}
}
9 changes: 4 additions & 5 deletions pkgs/sdk_triage_bot/tool/bench.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,6 @@ materially improve the classification performance.
| #56354 | `area-web`, `type-bug` |
| #56353 | `area-dart2wasm` |
| #56350 | `area-analyzer`, `type-enhancement` |
| #56348 | `area-intellij` |
| #56347 | `area-dart-cli`, `type-bug` |
| #56346 | `area-pkg`, `pkg-json`, `type-enhancement` |
| #56345 | `area-analyzer`, `type-enhancement` |
Expand All @@ -44,7 +43,7 @@ materially improve the classification performance.
| #56316 | `area-web` |
| #56315 | `area-web` |
| #56314 | `area-web`, `type-bug` |
| #56308 | `area-vm` |
| #56308 | `area-vm` `breaking-change-request` |
| #56306 | `area-vm`, `type-bug` |
| #56305 | `area-front-end`, `type-bug`, `type-question` |
| #56304 | `area-core-library`, `type-enhancement` |
Expand All @@ -55,13 +54,12 @@ materially improve the classification performance.
| #56283 | `area-dart2wasm` |
| #56256 | `area-front-end`, `type-bug` |
| #56254 | `area-pkg`, `pkg-vm-service`, `type-bug` |
| #56246 | `area-intellij` |
| #56240 | `area-intellij` |
| #56240 | `needs-info` |
| #56229 | `area-infrastructure` |
| #56227 | `area-native-interop` |
| #56220 | `area-infrastructure`, `type-code-health` |
| #56217 | `area-meta` |
| #56216 | `area-intellij` |
| #56216 | `needs-info` |
| #56214 | `area-native-interop` |
| #56208 | `area-google3`, `type-enhancement` |
| #56207 | `area-google3` |
Expand Down Expand Up @@ -108,3 +106,4 @@ We need more information from the user before we can triage these issues.

## Results
2024-08-27: 55.6% using gemini-1.5-flash-latest
2024-08-30: 64.8% using gemini-1.5-flash-latest
Loading