Supply: Unexpected MPNs being returned

Modified on Thu, 10 Apr at 10:02 AM

When searching or matching, additional parts are sometimes returned whose MPNs differ from the queried MPN only by non-alphanumeric characters, such as `-` or `.`.

For example, consider a supMultiMatch query for the following MPN:

query MultiMatch {
  supMultiMatch(queries: {mpn: "ASV-18432MHZ-EJ-T", limit: 5}) {
    hits
    parts {
      mpn
    }
  }
}

This may return multiple MPNs such as ASV-18.432MHZ-EJ-T as well as ASV-1.8432MHZ-E-J-T (and others).
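The relationship between these MPNs is easier to see once the non-alphanumeric characters are removed. The following is a minimal sketch for illustration only; the helper name `strip_non_alphanumeric` is hypothetical and is not part of the API, and it does not reproduce the search engine's actual indexing.

```python
import re


def strip_non_alphanumeric(mpn: str) -> str:
    """Remove every character that is not an ASCII letter or digit."""
    return re.sub(r"[^A-Za-z0-9]", "", mpn)


queried = "ASV-18432MHZ-EJ-T"
returned = ["ASV-18.432MHZ-EJ-T", "ASV-1.8432MHZ-E-J-T"]

for mpn in returned:
    # Each returned MPN collapses to the same string as the queried MPN.
    same = strip_non_alphanumeric(mpn) == strip_non_alphanumeric(queried)
    print(f"{mpn}: {same}")
```

All three MPNs reduce to the same sequence of letters and digits, which is why they can appear together in the results.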

This behaviour is due to the way MPNs are "tokenized" for indexing in the Elasticsearch engine.

MPNs are split into tokens, ignoring special characters (e.g., CV 3-200/SPG might become `CV`, `3200`, `SPG`).

Matching is then done using trigrams (sequences of three consecutive characters, which may be letters or digits). The search must fully match a tokenized MPN, but partial matches depend on trigrams. For example:

Match: "CV 3-20", "CV 320", "00 SPG" - Each part aligns with at least one full token

No Match: "CV 3", "CV 3-", "CV 32" - No part of these aligns with any token
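To make the term "trigram" concrete, the sketch below lists the trigrams of a single token. It only illustrates what a trigram is; the function name `trigrams` is hypothetical, and the code is not a model of how the search engine scores or accepts matches.

```python
from typing import List


def trigrams(token: str) -> List[str]:
    """Return every run of three consecutive characters in the token."""
    return [token[i:i + 3] for i in range(len(token) - 2)]


# Tokens taken from the CV 3-200/SPG example above.
print(trigrams("3200"))
print(trigrams("SPG"))
```

For the token `3200` this yields `320` and `200`, which is consistent with "CV 320" appearing in the Match list above.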

If an exact character-for-character match is required, the client application should compare the returned MPNs against the input MPN and filter the results after the response is received.
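A client-side filter of this kind might look like the following sketch. It assumes the `parts` list from the response has already been parsed into Python dictionaries, each with an `mpn` key as requested in the query above; the function name `filter_exact_mpn` and the sample data are hypothetical.

```python
from typing import Dict, List


def filter_exact_mpn(input_mpn: str, parts: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only the parts whose MPN is character-for-character equal to the input."""
    return [part for part in parts if part.get("mpn") == input_mpn]


# Illustrative data only, built from the MPNs mentioned in this article.
sample_parts = [
    {"mpn": "ASV-18432MHZ-EJ-T"},
    {"mpn": "ASV-18.432MHZ-EJ-T"},
    {"mpn": "ASV-1.8432MHZ-E-J-T"},
]

exact = filter_exact_mpn("ASV-18432MHZ-EJ-T", sample_parts)
print(exact)
```

The comparison here is case-sensitive. If the application should treat upper and lower case as equal, both sides can be normalized (for example with `str.upper()`) before comparing.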