Fuzzy string matching
Data Warehouse · Public preview · Planned
Description
Fabric Data Warehouse will introduce string similarity and comparison functions based on the Levenshtein and Jaro-Winkler algorithms. These functions make it easier to find strings that are similar, even when they differ by small changes or spelling errors.
New functions that will be added are:
* EDIT_DISTANCE - Returns the number of edits (insertions, deletions, substitutions) needed to transform one string into another.
* EDIT_DISTANCE_SIMILARITY - Calculates a similarity score (0-1) based on Levenshtein distance, where 1 means identical strings.
* JARO_WINKLER_DISTANCE - Measures the distance between two strings using the Jaro-Winkler algorithm, considering transpositions and common prefixes.
* JARO_WINKLER_SIMILARITY - Returns a similarity score (0-1) using Jaro-Winkler, optimized for short strings and minor typos.
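To make the four measures concrete, here is a minimal Python sketch of the textbook Levenshtein and Jaro-Winkler computations. It is an illustration only, not the Fabric implementation: the Python function names, the normalization by the longer string's length, the 0.1 prefix scale, the 4-character prefix cap, and the definition of distance as 1 minus similarity are all assumptions taken from the standard formulations, and the warehouse functions may differ in scaling, collation handling, or NULL behavior.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def edit_distance_similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - edit_distance(a, b) / longest


def jaro_similarity(a: str, b: str) -> float:
    """Standard Jaro similarity in the range 0-1."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    # Characters match only if they are equal and within this window.
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_matched = [False] * len(a)
    b_matched = [False] * len(b)
    matches = 0
    for i, ca in enumerate(a):
        lo = max(0, i - window)
        hi = min(len(b), i + window + 1)
        for j in range(lo, hi):
            if not b_matched[j] and b[j] == ca:
                a_matched[i] = b_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that appear in a different order, counted in pairs.
    a_seq = [c for c, m in zip(a, a_matched) if m]
    b_seq = [c for c, m in zip(b, b_matched) if m]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3


def jaro_winkler_similarity(a: str, b: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    jaro = jaro_similarity(a, b)
    prefix = 0
    for x, y in zip(a[:4], b[:4]):
        if x != y:
            break
        prefix += 1
    return jaro + prefix * prefix_scale * (1.0 - jaro)


def jaro_winkler_distance(a: str, b: str) -> float:
    """Assumed definition: distance is 1 - similarity."""
    return 1.0 - jaro_winkler_similarity(a, b)


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
    print(round(edit_distance_similarity("kitten", "sitting"), 4))
    print(round(jaro_winkler_similarity("MARTHA", "MARHTA"), 4))
```

In this sketch, "kitten" and "sitting" are 3 edits apart, while the transposed pair "MARTHA" and "MARHTA" scores close to 1 under Jaro-Winkler because the two strings share a three-character prefix and differ only by one swap. That is the kind of short-string typo the description says the Jaro-Winkler functions are suited to.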
Change History
-
2026-05-21
Roadmap Item Added
Workload: Data Warehouse