Union Database Matching Algorithms

For more information on standard bibliographic number matching and text-based matching, including validation notes, please choose a union database below.

  • BC Union Catalogue (formerly OutLook Online)

  • ELN Media Cat (retired in April 2021)

  • ELN Serials Cat (retired in April 2021)

BC Union Catalogue (formerly OutLook Online)

  • STANDARD BIBLIOGRAPHIC NUMBER MATCH

    022 - ISSN withCrossreferencing (match on first occurrence - hyphens and spaces are ignored, upper case letters are normalized)

    020 - ISBN withCrossreferencing (match on first occurrence - hyphens and spaces are ignored, upper case letters are normalized)

    010 - LCCN withCrossreferencing (hyphens and spaces are ignored, upper case letters are normalized)

    Validation on all number matches on:

    245 $a 10 characters (normalize upper case letters, special characters are ignored; check field 008 position 35-37 for Language to decide which article to drop in the normalization of field 245)

  • TEXT-BASED MATCH

    245 $a (all characters, using the filing indicator) concatenated with $b (all characters); upper case letters are normalized

    Validate Leader position 6 and 7 (mandatory)

    Validate 245 $h first 10 characters; $p first 10 characters as separate validation points, no concatenation

    Validate 100, 110, 111 $a first 100 characters (hyphens and spaces are ignored and upper case letters are normalized; if there is more than one author tag, the lowest-numbered tag is used)

    Validate 260 $b keyword search of first 5 words; match any one word; 260 $c match year with any of first 5 years

  • NOTES:

    • Absence of a VALIDATION subfield in any match point WILL inhibit an otherwise valid match.

    • We are recommending that the leader check be dropped for numeric matches.
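
The number normalization, the 245 $a validation, and the text-based title key described above can be sketched as follows. This is an illustrative Python sketch, not the program's actual code. The function names, the record layout (plain dictionaries), the article table keyed by the 008 position 35-37 language code, the reading of "special characters" as anything other than letters, digits, and spaces, and the single space used when concatenating $a and $b are all assumptions.

```python
import re

# Hypothetical article table keyed by the language code in 008 positions 35-37.
# The program's real list is not given in this document; these entries are assumptions.
LEADING_ARTICLES = {
    "eng": ("the", "an", "a"),
    "fre": ("les", "le", "la", "une", "un"),
}


def normalize_number(value):
    """Normalize an ISSN, ISBN, or LCCN: ignore hyphens and spaces, fold case."""
    return re.sub(r"[-\s]", "", value).lower()


def title_key(title_a, language, length=10):
    """Build the 245 $a validation key from the first `length` characters."""
    text = title_a.lower().strip()
    # Drop one leading article, chosen by the record's language code.
    for article in LEADING_ARTICLES.get(language, ()):
        if text.startswith(article + " "):
            text = text[len(article) + 1:]
            break
    # Assumption: "special characters" are anything but letters, digits, spaces.
    text = re.sub(r"[^\w\s]", "", text)
    text = re.sub(r"\s+", " ", text).strip()
    return text[:length]


def validated_number_match(rec_a, rec_b):
    """Number match followed by the 245 $a validation."""
    if normalize_number(rec_a["number"]) != normalize_number(rec_b["number"]):
        return False
    # Per the notes, a missing validation subfield inhibits the match.
    if not rec_a.get("title_a") or not rec_b.get("title_a"):
        return False
    key_a = title_key(rec_a["title_a"], rec_a["language"])
    key_b = title_key(rec_b["title_a"], rec_b["language"])
    return key_a == key_b


def text_match_key(title_a, title_b, nonfiling):
    """Text-based key: 245 $a with the filing indicator applied, then $b.

    `nonfiling` is the 245 second indicator (characters to skip).
    Joining with a single space is an assumption.
    """
    return (title_a[nonfiling:] + " " + title_b).lower().strip()
```

For example, under these assumptions a record with ISBN "0-8044-2957-X" and title "The Great Gatsby /" would match one with ISBN "080442957x" and title "Great Gatsby", since both reduce to the same number and the same 10-character title key.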

ELN Media Cat (retired in April 2021)

  • STANDARD BIBLIOGRAPHIC NUMBER MATCH

    010 - LCCN withCrossreferencing (hyphens and spaces are ignored, upper case letters are normalized)
    020 - ISBN withCrossreferencing (match on first occurrence, hyphens and spaces are ignored, upper case letters are normalized)
    022 - ISSN withCrossreferencing (match on first occurrence, hyphens and spaces are ignored, upper case letters are normalized)
    028 40 - Publisher's number (match on first occurrence)

    Validation of all number matches on the following:
    Leader position 6 and 7
    245 $a 15 characters (normalize upper case letters, special characters are ignored; checks field 008 position 35-37 for Language to decide which article to drop in the normalization of field 245)

  • TEXT-BASED MATCH

    245 $a (all characters) $b 15 characters

    Validate leader position 6 and 7

    Validate 245 $h 7 characters

    Validate 245 $p 15 characters

    Validate 260 $b Publisher's name (first 5 words; match any one word), 260 $c Publisher's Date (first 5 dates)

    Validate 440 $a 15 characters (normalizeFirstArticle)

    Validate 710 $a string of first 15 characters. A valid match is when this set of characters appears in any of a record's 710 fields.

    Please Note: This does not mean that "all" 710 fields must match, only that the first 15 characters from one of the 710 fields match.

  • NOTES:

    • Validation on 008 positions 23 and 35-37 is optional for TEXT-BASED matches. ELN has chosen not to include these matching criteria.

    • The program uses the 245 filing indicator. To safeguard against incorrect usage, however, the program has a stop list which includes "the" and "an". This stop list is applied to all 245 subfields, excluding $h.

    • The stop list ("the" and "an") is also applied to the 4xx match.

    • For 245 $b and $p, the absence of a subfield in merging records will not inhibit an otherwise valid match.
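
The 260 $b keyword validation, the 710 $a validation, and the stop list can be sketched roughly as below. This is a hedged Python illustration rather than the program's implementation: the function names, the word tokenization, the case folding applied to 710 $a, and the reading of the stop list as removing only a leading "the" or "an" are assumptions.

```python
import re

# Stop list named in the notes above.
STOP_LIST = {"the", "an"}


def drop_leading_stop_word(text):
    """Remove a leading stop word (assumed to apply to the first word only)."""
    words = text.split(None, 1)
    if len(words) == 2 and words[0].lower() in STOP_LIST:
        return words[1]
    return text


def first_words(text, limit=5):
    """Return the first `limit` words, lower-cased (assumed tokenization)."""
    return set(re.findall(r"\w+", text.lower())[:limit])


def publisher_matches(pub_a, pub_b):
    """260 $b: valid if any one of the first 5 words is shared."""
    return bool(first_words(pub_a) & first_words(pub_b))


def corporate_name_matches(fields_a, fields_b, length=15):
    """710 $a: valid if the first 15 characters of any one 710 field
    in one record equal those of any 710 field in the other."""
    keys_a = {f.lower()[:length] for f in fields_a}
    keys_b = {f.lower()[:length] for f in fields_b}
    return bool(keys_a & keys_b)
```

Under these assumptions, two records whose 710 fields include "National Film Board of Canada" and "National Film Board" respectively would validate, because both share the 15-character prefix "national film b" even though their other 710 fields differ.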

ELN Serials Cat (retired in April 2021)

  • STANDARD BIBLIOGRAPHIC NUMBER MATCH

    022 - ISSN withCrossreferencing (match on first occurrence - hyphens and spaces are ignored, upper case letters are normalized)

    020 - ISBN withCrossreferencing (match on first occurrence - hyphens and spaces are ignored, upper case letters are normalized)

    Validation of all number matches on:

    245 $a 10 characters (normalize upper case letters, special characters are ignored; checks field 008 position 35-37 for Language to decide which article to drop in the normalization of field 245)
     
  • TEXT-BASED MATCH

    245 - all characters $a and $b

    Validate 260 $b Publisher's name (first 5 words; match any one word); 260 $c Publisher's Date (first 5 dates)

    Validate 245 $h - first 10 characters

    Validate 100 $a - first 100 characters
     
  • NOTES:
    • Absence of a VALIDATION subfield for any matches in the Serials database 130 or 240 (uniform title) match would be more accurate than 245; however, matching keys are predefined for the following fields only: 001, 035, 010, 020, 022, 028 (with indicators 40 only), 245.
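
The 260 $c validation used in the text-based match (first 5 dates) might look like the following Python sketch. It is an illustration only: the function names are hypothetical, and treating a "date" as any four-digit run in the subfield is an assumption.

```python
import re


def first_years(date_c, limit=5):
    """Return up to the first `limit` four-digit years found in 260 $c."""
    return re.findall(r"\d{4}", date_c)[:limit]


def date_matches(date_a, date_b):
    """260 $c: valid if any of the first 5 years in one record
    also appears among the first 5 years of the other."""
    return bool(set(first_years(date_a)) & set(first_years(date_b)))
```

With this reading, "c1998-2001." and "[2001]" would validate on the shared year 2001, while "1998" and "1999" would not.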