Align Utils::suggestionList() with the reference implementation #1075

Merged
merged 13 commits into from
Mar 14, 2022
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -41,6 +41,7 @@ You can find and compare releases at the [GitHub release page](https://github.co
 - Throw if `Introspection::fromSchema()` returns no data
 - Reorganize abstract class `ASTValidationContext` to interface `ValidationContext`
 - Reorganize AST interfaces related to schema and type extensions
+- Align `Utils::suggestionList()` with the reference implementation (#1075)

 ### Added

129 changes: 129 additions & 0 deletions src/Utils/LexicalDistance.php
@@ -0,0 +1,129 @@
+<?php declare(strict_types=1);
+
+namespace GraphQL\Utils;
+
+/**
+ * Computes the lexical distance between strings A and B.
+ *
+ * The "distance" between two strings is given by counting the minimum number
+ * of edits needed to transform string A into string B. An edit can be an
+ * insertion, deletion, or substitution of a single character, or a swap of two
+ * adjacent characters.
+ *
+ * Includes a custom alteration from Damerau-Levenshtein to treat case changes
+ * as a single edit which helps identify mis-cased values with an edit distance
+ * of 1.
+ *
+ * This distance can be useful for detecting typos in input or sorting.
+ *
+ * Unlike the native levenshtein() function, which always returns int,
+ * LexicalDistance::measure() returns int|null: it takes a threshold and
+ * returns null if the measured distance exceeds it.
+ */
+class LexicalDistance
+{
+    private string $input;
+
+    private string $inputLowerCase;
+
+    /**
+     * List of char codes in the input string.
+     *
+     * @var array<int>
+     */
+    private array $inputArray;
+
+    public function __construct(string $input)
+    {
+        $this->input = $input;
+        $this->inputLowerCase = \strtolower($input);
+        $this->inputArray = self::stringToArray($this->inputLowerCase);
+    }
+
+    public function measure(string $option, float $threshold): ?int
+    {
+        if ($this->input === $option) {
+            return 0;
+        }
+
+        $optionLowerCase = \strtolower($option);
+
+        // Any case change counts as a single edit
+        if ($this->inputLowerCase === $optionLowerCase) {
+            return 1;
+        }
+
+        $a = self::stringToArray($optionLowerCase);
+        $b = $this->inputArray;
+
+        if (\count($a) < \count($b)) {
+            $tmp = $a;
+            $a = $b;
+            $b = $tmp;
+        }
+
+        $aLength = \count($a);
+        $bLength = \count($b);
+
+        if ($aLength - $bLength > $threshold) {
+            return null;
+        }
+
+        /** @var array<array<int>> $rows */
+        $rows = [];
+        for ($i = 0; $i <= $bLength; ++$i) {
+            $rows[0][$i] = $i;
+        }
+
+        for ($i = 1; $i <= $aLength; ++$i) {
+            $upRow = &$rows[($i - 1) % 3];
+            $currentRow = &$rows[$i % 3];
+
+            $smallestCell = ($currentRow[0] = $i);
+            for ($j = 1; $j <= $bLength; ++$j) {
+                $cost = $a[$i - 1] === $b[$j - 1] ? 0 : 1;
+
+                $currentCell = \min(
+                    $upRow[$j] + 1, // delete
+                    $currentRow[$j - 1] + 1, // insert
+                    $upRow[$j - 1] + $cost, // substitute
+                );
+
+                if ($i > 1 && $j > 1 && $a[$i - 1] === $b[$j - 2] && $a[$i - 2] === $b[$j - 1]) {
+                    // transposition
+                    $doubleDiagonalCell = $rows[($i - 2) % 3][$j - 2];
+                    $currentCell = \min($currentCell, $doubleDiagonalCell + 1);
+                }
+
+                if ($currentCell < $smallestCell) {
+                    $smallestCell = $currentCell;
+                }
+
+                $currentRow[$j] = $currentCell;
+            }
+
+            // Early exit, since distance can't go smaller than smallest element of the previous row.
+            if ($smallestCell > $threshold) {
+                return null;
+            }
+        }
+
+        $distance = $rows[$aLength % 3][$bLength];
+
+        return $distance <= $threshold ? $distance : null;
+    }
+
+    /**
+     * Returns a list of char codes in the given string.
+     *
+     * @return array<int>
+     */
+    private static function stringToArray(string $str): array
+    {
+        $array = [];
+        foreach (\mb_str_split($str) as $char) {
+            $array[] = \mb_ord($char);
+        }
+
+        return $array;
+    }
+}
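
For readers comparing the new class with PHP's native `levenshtein()`, the difference that matters most is the transposition step. The sketch below is not part of this PR: `osaDistance()` is a hypothetical helper that uses a full matrix instead of the three rotating rows, works on bytes rather than code points, and omits the threshold and the case-change rule. It only illustrates why a swap of two adjacent characters costs one edit instead of two.

```php
<?php declare(strict_types=1);

/**
 * Hypothetical helper for illustration only (not in the PR).
 * Edit distance with insert, delete, substitute and adjacent-swap edits,
 * computed over bytes with a full (aLength + 1) x (bLength + 1) matrix.
 */
function osaDistance(string $a, string $b): int
{
    $aLength = strlen($a);
    $bLength = strlen($b);

    // First column and first row: distance from or to the empty prefix.
    $d = [];
    for ($i = 0; $i <= $aLength; ++$i) {
        $d[$i][0] = $i;
    }
    for ($j = 0; $j <= $bLength; ++$j) {
        $d[0][$j] = $j;
    }

    for ($i = 1; $i <= $aLength; ++$i) {
        for ($j = 1; $j <= $bLength; ++$j) {
            $cost = $a[$i - 1] === $b[$j - 1] ? 0 : 1;

            $d[$i][$j] = min(
                $d[$i - 1][$j] + 1,         // delete
                $d[$i][$j - 1] + 1,         // insert
                $d[$i - 1][$j - 1] + $cost  // substitute
            );

            // Transposition: the last two characters of both prefixes are swapped.
            if ($i > 1 && $j > 1 && $a[$i - 1] === $b[$j - 2] && $a[$i - 2] === $b[$j - 1]) {
                $d[$i][$j] = min($d[$i][$j], $d[$i - 2][$j - 2] + 1);
            }
        }
    }

    return $d[$aLength][$bLength];
}

$swapped = osaDistance('ab', 'ba'); // 1: a single adjacent swap
$native = levenshtein('ab', 'ba');  // 2: two substitutions
```

The class in the diff keeps only rows `i`, `i - 1` and `i - 2` (hence the `% 3` indexing), which is enough because the transposition step never looks further back than two rows.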
27 changes: 10 additions & 17 deletions src/Utils/Utils.php
@@ -6,7 +6,6 @@
 use function array_map;
 use function array_reduce;
 use function array_slice;
-use function asort;
 use function count;
 use function dechex;
 use function get_class;
@@ -20,7 +19,6 @@
 use function is_scalar;
 use function is_string;
 use function json_encode;
-use function levenshtein;
 use function mb_convert_encoding;
 use function mb_strlen;
 use function mb_substr;
@@ -31,7 +29,6 @@
 use function property_exists;
 use function range;
 use stdClass;
-use function strtolower;
 use function unpack;

 class Utils
@@ -284,34 +281,30 @@ static function ($list, $index) use ($selected, $selectedLength): string {
      * Given an invalid input string and a list of valid options, returns a filtered
      * list of valid options sorted based on their similarity with the input.
      *
-     * Includes a custom alteration from Damerau-Levenshtein to treat case changes
-     * as a single edit which helps identify mis-cased values with an edit distance
-     * of 1
-     *
      * @param array<string> $options
      *
      * @return array<int, string>
      */
     public static function suggestionList(string $input, array $options): array
     {
+        /** @var array<string, int> $optionsByDistance */
         $optionsByDistance = [];
+        $lexicalDistance = new LexicalDistance($input);
         $threshold = mb_strlen($input) * 0.4 + 1;
         foreach ($options as $option) {
-            if ($input === $option) {
-                $distance = 0;
-            } else {
-                $distance = (strtolower($input) === strtolower($option)
-                    ? 1
-                    : levenshtein($input, $option));
-            }
+            $distance = $lexicalDistance->measure($option, $threshold);

-            if ($distance <= $threshold) {
+            if ($distance !== null) {
                 $optionsByDistance[$option] = $distance;
             }
         }

-        asort($optionsByDistance);
+        \uksort($optionsByDistance, static function (string $a, string $b) use ($optionsByDistance) {
+            $distanceDiff = $optionsByDistance[$a] - $optionsByDistance[$b];
+
+            return $distanceDiff !== 0 ? $distanceDiff : \strnatcmp($a, $b);
+        });

-        return array_keys($optionsByDistance);
+        return array_map('strval', array_keys($optionsByDistance));
     }
 }
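
Two details of the new sorting code are easy to miss: ties in distance are broken by natural string order, and the keys are cast back to strings at the end. The cast is presumably there because PHP stores numeric string keys such as `'123'` as integers; that motivation is an assumption, not something the diff states. The sketch below uses made-up distances to show both effects in isolation.

```php
<?php declare(strict_types=1);

// Hypothetical distances keyed by option, standing in for what the loop collects.
$optionsByDistance = ['b10' => 2, 'b2' => 2, '123' => 1, 'a' => 2];

// PHP has already turned the numeric string key '123' into the integer 123.
$firstKeyIsInt = is_int(array_keys($optionsByDistance)[2]);

// Sort by distance first, then by natural order ("b2" before "b10").
uksort($optionsByDistance, static function ($a, $b) use ($optionsByDistance): int {
    $distanceDiff = $optionsByDistance[$a] - $optionsByDistance[$b];

    return $distanceDiff !== 0 ? $distanceDiff : strnatcmp((string) $a, (string) $b);
});

// Cast keys back to strings so callers always receive a list of strings.
$suggestions = array_map('strval', array_keys($optionsByDistance));
// $suggestions is ['123', 'a', 'b2', 'b10']
```

With the previous `asort()` call, the relative order of options with equal distance was not specified by this code, so the explicit tie-break makes the suggestion order deterministic.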
2 changes: 1 addition & 1 deletion tests/Error/ErrorTest.php
@@ -201,7 +201,7 @@ public function getNodes(): ?array
         self::assertEquals([1 => 2], $locatedError->getPositions());
         self::assertNotNull($locatedError->getSource());

-        $error = new class('msg', new NullValueNode([]), null, [], ) extends Error {
+        $error = new class('msg', new NullValueNode([]), null, []) extends Error {
             public function getNodes(): ?array
             {
                 return [new NullValueNode([])];
8 changes: 4 additions & 4 deletions tests/Executor/DeferredFieldsTest.php
@@ -592,10 +592,10 @@ public function testDeferredChaining(): void
             }
         ');

-        $author1 = ['name' => 'John'/*, 'bestFriend' => ['name' => 'Dirk']*/];
-        $author2 = ['name' => 'Jane'/*, 'bestFriend' => ['name' => 'Joe']*/];
-        $author3 = ['name' => 'Joe'/*, 'bestFriend' => ['name' => 'Jane']*/];
-        $author4 = ['name' => 'Dirk'/*, 'bestFriend' => ['name' => 'John']*/];
+        $author1 = ['name' => 'John'/* , 'bestFriend' => ['name' => 'Dirk'] */];
+        $author2 = ['name' => 'Jane'/* , 'bestFriend' => ['name' => 'Joe'] */];
+        $author3 = ['name' => 'Joe'/* , 'bestFriend' => ['name' => 'Jane'] */];
+        $author4 = ['name' => 'Dirk'/* , 'bestFriend' => ['name' => 'John'] */];

         $story1 = ['title' => 'Story #8', 'author' => $author1];
         $story2 = ['title' => 'Story #3', 'author' => $author3];
3 changes: 1 addition & 2 deletions tests/Type/EnumTypeTest.php
@@ -348,8 +348,7 @@ public function testDoesNotAcceptValuesWithIncorrectCasing(): void
             '{ colorEnum(fromEnum: green) }',
             null,
             [
-                // Improves upon the reference implementation
-                'message' => 'Value "green" does not exist in "Color" enum. Did you mean the enum value "GREEN"?',
+                'message' => 'Value "green" does not exist in "Color" enum. Did you mean the enum value "GREEN" or "RED"?',
                 'locations' => [new SourceLocation(1, 23)],
             ]
         );
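
The changed expectation above follows from comparing lowercased strings against the same threshold. The sketch below reproduces the arithmetic for the input `green` and the option `RED`. It uses the native `levenshtein()` on lowercased strings as a stand-in for `LexicalDistance::measure()`, which is only valid here because this pair involves no adjacent swaps.

```php
<?php declare(strict_types=1);

$input = 'green';

// Same formula as Utils::suggestionList(): 5 * 0.4 + 1 = 3.
$threshold = mb_strlen($input) * 0.4 + 1;

// Before this PR: case-sensitive comparison, "green" and "RED" share no characters.
$before = levenshtein($input, 'RED'); // 5

// After this PR: both sides are lowercased before measuring.
// Stand-in for LexicalDistance::measure(); see the assumption above.
$after = levenshtein(strtolower($input), strtolower('RED')); // 3

$suggestedBefore = $before <= $threshold; // false
$suggestedAfter = $after <= $threshold;   // true
```

So `RED` lands exactly on the threshold and is now included, after `GREEN`, whose distance is 1 because it differs from the input only by case.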
2 changes: 1 addition & 1 deletion tests/Type/ResolveInfoTest.php
@@ -322,7 +322,7 @@ public function testMergedFragmentsFieldSelection(): void
                 'url' => true,
             ],
             'replies' => [
-                'body' => true, //this would be missing if not for the fix https://github.com/webonyx/graphql-php/pull/98
+                'body' => true, // this would be missing if not for the fix https://github.com/webonyx/graphql-php/pull/98
                 'author' => [
                     'id' => true,
                     'name' => true,
4 changes: 2 additions & 2 deletions tests/Utils/BreakingChangesFinderTest.php
@@ -30,7 +30,7 @@ public function setUp(): void
         ]);
     }

-    //DESCRIBE: findBreakingChanges
+    // DESCRIBE: findBreakingChanges

     /**
      * @see it('should detect if a type was removed or not')
@@ -1769,7 +1769,7 @@ public function testShouldDetectIfATypeWasAddedToAUnionType(): void
             ],
         ]);
         // logically equivalent to type1; findTypesRemovedFromUnions should not
-        //treat this as different than type1
+        // treat this as different than type1
         $type1a = new ObjectType([
             'name' => 'Type1',
             'fields' => [