An organization’s biggest asset is its data about its customers and resources. Banks, for example, hold a wealth of information about their customers’ transactions, choices and inclinations. The best use of this information is to enhance the customer experience by gaining incremental insights about those customers. To achieve this objective, organizations need a strong IT infrastructure to support data management, analytics, data governance and data quality. With the best of both worlds (IT infrastructure and business knowledge), an organization can build strong protocols for Master Data Management. This practice enables a company to link all of its critical data to one file (a master) that provides a common point of reference. Overall, Master Data Management streamlines data sharing among departments and groups.
Any successful marketing campaign requires clean contact information for its customers. Name and address are the critical contact components for a customer. If these basic building blocks are not organized and cleansed, they lead to issues such as duplicate communications, improper householding and incorrect salutations. Standardization, parsing, casing, verification and enrichment together form the approach for producing standardized customer names and addresses for Master Data Management and analytics.
Name Standardization
Many problems can arise when standardizing name and address information. Some of the most common problems and the approaches to them are described below.
Business Scenarios for Name Standardization:
- Individual and Organization names combined and incorrectly classified
- Presence of abbreviations, numbers, apostrophes, special characters and improper casing
- Inconsistently written name prefixes (Mister vs. Mr., etc.)
- Names from different countries of origin, mixed locales
- Non-name words (e.g. of, the, dated, from, company specific words)
- Multiple middle and last names
Though there are many ways to achieve the desired quality of standardized names, some proven ones are listed below:
- Use identification analysis along with other customized approaches to correctly classify an entity as an organization or an individual. This allows each type of entity to be standardized with its own rules.
- Use multiple standardization definitions to separate unnecessary words (special characters, numbers, company-specific words, filler words, etc.) from names, since these can cause improper standardization.
- Define and apply customized schemes to standardize differently spelled words to common words (e.g. Assoc, Assocn. and Assn. to Association).
- Use different locales to standardize names properly.
- Use single and multiple name parsing steps to split names into their parts (prefix, first, middle, last, suffix and title).
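The approaches above can be sketched in a few lines of Python. This is only a minimal illustration, not a production rule set: the scheme dictionaries, the list of organization keywords and all function names are hypothetical, and treating "PhD" as a title rather than a suffix is an assumption. The sketch also simply discards purely numeric tokens before parsing.

```python
import re
from dataclasses import dataclass

# Hypothetical, deliberately tiny schemes; real ones would be much larger.
PREFIX_SCHEME = {"MISTER": "Mr.", "MR": "Mr.", "DOCTOR": "Dr.", "DR": "Dr."}
SUFFIX_SCHEME = {"SENIOR": "Sr.", "SR": "Sr.", "JUNIOR": "Jr.", "JR": "Jr."}
TITLE_SCHEME = {"PHD": "PhD"}
ORG_SCHEME = {"ASSOC": "Association", "ASSOCN": "Association", "ASSN": "Association"}
ORG_WORDS = set(ORG_SCHEME) | {"ASSOCIATION", "BANK", "INC", "LLC", "CORP"}


@dataclass
class ParsedName:
    prefix: str = ""
    first: str = ""
    middle: str = ""
    last: str = ""
    suffix: str = ""
    title: str = ""


def key(token):
    """Normalize a token for scheme lookup: uppercase letters only."""
    return re.sub(r"[^A-Z]", "", token.upper())


def smart_case(token):
    """Capitalize the first letter and any letter after an apostrophe or hyphen."""
    return re.sub(r"(^|['’-])([a-z])",
                  lambda m: m.group(1) + m.group(2).upper(),
                  token.lower())


def is_organization(raw):
    """Crude identification analysis: any organization keyword marks an organization."""
    return any(key(t) in ORG_WORDS for t in raw.split())


def standardize_organization(raw):
    """Map differently spelled words to a common word and fix casing."""
    return " ".join(ORG_SCHEME.get(key(t), smart_case(t)) for t in raw.split())


def parse_name(raw):
    """Split an individual's name into prefix, first, middle, last, suffix, title."""
    # Drop commas and purely numeric tokens before parsing.
    tokens = [t for t in raw.replace(",", " ").split() if not t.isdigit()]
    parsed = ParsedName()
    if tokens and key(tokens[0]) in PREFIX_SCHEME:
        parsed.prefix = PREFIX_SCHEME[key(tokens.pop(0))]
    # Peel titles and suffixes off the end, one token at a time.
    while tokens:
        k = key(tokens[-1])
        if k in TITLE_SCHEME and not parsed.title:
            parsed.title = TITLE_SCHEME[k]
        elif k in SUFFIX_SCHEME and not parsed.suffix:
            parsed.suffix = SUFFIX_SCHEME[k]
        else:
            break
        tokens.pop()
    words = [smart_case(t) for t in tokens]
    if words:
        parsed.first = words[0]
    if len(words) == 2:
        parsed.last = words[1]
    elif len(words) > 2:
        # Everything after the first middle name is treated as the last name.
        parsed.middle = words[1]
        parsed.last = " ".join(words[2:])
    return parsed
```

A caller would first route each record with `is_organization`, then apply either `standardize_organization` or `parse_name`. Names with multiple middle names, or names from locales with a different word order, would need additional parsing rules that this sketch does not attempt.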
Below are some examples of standardized names:
Original Name | Standardized Name | Prefix | First Name | Middle Name | Last Name | Suffix | Title |
JOSEPH K O’NEIL | Joseph K O’Neil | | Joseph | K | O’Neil | | |
KATHERINE HELENE DORASAVAGE PALKO | Katherine Helene Dorasavage Palko | | Katherine | Helene | Dorasavage Palko | | |
MISTER ALAN J. SIMPSON SENIOR | Mr. Alan J. Simpson, Sr. | Mr. | Alan | J. | Simpson | Sr. | |
DR. GEORGE ERICKSON PHD | Dr. George Erickson, PhD | Dr. | George | | Erickson | | PhD |
LENEA GLOVER 1997 | Lenea Glover, 1997 | | | | | | |
Address Standardization
Business Scenarios for Address Standardization:
- Non-standard street words (Street / Str / St / St. etc.)
- Missing state, city, country, postal code information
- Presence of abbreviations, numbers, apostrophes, special characters and improper casing
- Addresses from different countries
- Addresses containing PO Boxes, Armed Forces addresses
- Person or organization names attached along with the address
- Change of address
Address standardization can be tricky because of missing or improper information. Below are some useful approaches:
- Subscribe to reliable address verification services that support multiple locales (e.g. USPS for addresses in the United States) to verify and correct addresses.
- Use standardization definitions for automatic standardization, corrections and cleaning.
- Define and apply customized schemes to standardize differently spelled words to common words (e.g. Street, Str, Str. to Street).
- Separate any non-address type information from address lines.
- Apply parsing algorithms to parse the different components of any address (e.g. building number, street name, street type, unit type, unit number, landmark etc.).
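As a rough sketch of the scheme and parsing steps above, the Python below standardizes a single street line. The scheme contents and function names are hypothetical, and the choice to map street words to short forms such as "St" (rather than the full word) is an assumption that follows the USPS-style output shown in the examples in this section. It handles only simple US-style lines of the form "number, street name, street type, optional unit".

```python
import re

# Hypothetical, deliberately small schemes for illustration only.
STREET_TYPE_SCHEME = {
    "STREET": "St", "STR": "St", "ST": "St",
    "ROAD": "Rd", "RD": "Rd",
    "AVENUE": "Ave", "AVE": "Ave",
    "LANE": "Ln", "LN": "Ln",
}
UNIT_TYPE_SCHEME = {
    "APARTMENT": "Apt", "APT": "Apt",
    "SUITE": "Ste", "STE": "Ste",
    "UNIT": "Unit",
}


def parse_street_line(line):
    """Split a simple street line into its components."""
    # Uppercase for lookup and strip punctuation that carries no meaning here.
    tokens = [re.sub(r"[.,#]", "", t) for t in line.upper().split()]
    tokens = [t for t in tokens if t]
    parts = {"building_number": "", "street_name": "", "street_type": "",
             "unit_type": "", "unit_number": ""}
    # A leading token that starts with a digit is taken as the building number.
    if tokens and tokens[0][0].isdigit():
        parts["building_number"] = tokens.pop(0)
    # A trailing "<unit type> <unit number>" pair, e.g. "APT 4B".
    if len(tokens) >= 2 and tokens[-2] in UNIT_TYPE_SCHEME:
        parts["unit_number"] = tokens.pop()
        parts["unit_type"] = UNIT_TYPE_SCHEME[tokens.pop()]
    # The last remaining token is the street type if the scheme knows it.
    if tokens and tokens[-1] in STREET_TYPE_SCHEME:
        parts["street_type"] = STREET_TYPE_SCHEME[tokens.pop()]
    parts["street_name"] = " ".join(t.capitalize() for t in tokens)
    return parts


def standardize_street_line(line):
    """Rebuild the street line from its parsed, standardized components."""
    p = parse_street_line(line)
    ordered = [p["building_number"], p["street_name"], p["street_type"],
               p["unit_type"], p["unit_number"]]
    return " ".join(x for x in ordered if x)
```

Scheme-based cleanup like this only normalizes spelling and layout. It cannot tell whether the address exists or whether the customer has moved, which is why a verification service remains the first item on the list.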
Below are some examples of standardized addresses:
Original Address | Standardized Address |
HC 27 BOX 215, HWY 7 NORTH Jasper, AR 72641 | HC 27 Box 215 Jasper, AR 72641-9511 |
ANDREW R NOE DANIELLE D NOE PSC 46 BOX 914 APO ARMED FORCES EUROPE 094690010 | PSC 46 Box 914 APO, AE 09469-0010 |
D6 CALLE A, JUNCOS, PR, 7772902 | D6 Calle A, Brisas del Prado, PR, 00777-2902 |
15406 YELLOW BLUFF RD, JACKSONVILLE, FL, 322261131 | 15407 Herbie Ln, Jacksonville, FL, 32218-8530 |
501 WALNUT STREET, CATASUQUA, PA | 501 Walnut St, Catasauqua, PA, 18032-1706 |
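Several of the original postal codes above are run together without a hyphen, and one ("7772902") appears to have lost its leading zeros. The sketch below shows one way to format such values. The function name is hypothetical, and it assumes that any value with six to nine digits is a ZIP+4 code whose leading zeros were dropped. It only reformats the digits; correcting a postal code, as in the Jacksonville row, still requires a verification service.

```python
import re


def format_us_zip(raw):
    """Format a US postal code as NNNNN or NNNNN-NNNN, restoring leading zeros."""
    digits = re.sub(r"\D", "", raw)
    if not digits or len(digits) > 9:
        return None  # nothing usable, or too many digits to be a ZIP code
    if len(digits) <= 5:
        return digits.zfill(5)
    # Assumption: 6 to 9 digits is a ZIP+4 that lost its leading zeros.
    digits = digits.zfill(9)
    return f"{digits[:5]}-{digits[5:]}"
```

For example, `format_us_zip("094690010")` gives `"09469-0010"` and `format_us_zip("7772902")` gives `"00777-2902"`, matching the standardized values in the examples.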
Conclusion
As new data and data sources are introduced, new challenges surface that have not been encountered before. Hence, improving data quality and Master Data Management is an ongoing process of design, planning and monitoring cycles. To achieve the desired result, one needs to work iteratively with IT and business departments to understand the various scenarios and then arrive at a comprehensive approach. Maintaining and improving the health of the data requires continuous investment of organizational resources.