my fork of dmp
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
# diff-match-patch-rs: Efficient port of Google's diff-match-patch implemented in Rust

A very **fast**, **accurate** and **wasm ready** port of [Diff Match Patch](https://github.com/dmsnell/diff-match-patch) in Rust. The
diff implementation is based on [Myers' diff algorithm](https://neil.fraser.name/writing/diff/myers.pdf).


## Highlights of this crate
- Exposes two modes of operating with `diff-match-patch`, a `Efficient` mode and `Compat` mode. While the `Efficient` mode squeezes out the max performance the `Compat` mode ensures compatibility across other libraries or implementations (rust or otherwise). According to [Benchmarks](#benchmarks), our slower `Compat` mode is still faster than other implementations in rust.
    - `**Efficient**` mode works on `&[u8]` and the generated diffs break compatibility with other implementation. Use the `**Efficient**` mode ONLY if you are using this `[crate]()` at the source of diff generation and the destination.
    - `**Compat**` mode on the other hand works on `&[char]` and the generated `diffs` and `patches` are compatible across other implementations of `diff-match-patch`. Checkout `tests/compat.rs` for some test cases around this.
-  `wasm` ready, you can check out a `[demo here](https://github.com/AnubhabB/wasm-diff.git)`
- **Accurate**, can't believe I have to write this but during the course of this implementation I've realised there are a bunch of implementations that have some implementation issues (wrong diffs, inaccurate flows, silent errors etc.).
- Helper method **pretty_html** provided by this crate is a bit more advanced and allows some configurations to control the generated visuals elements.
- Well tested
- Added a `fuzzer` for sanity


## Usage Examples

### `Effitient` mode

```rust
use diff_match_patch_rs::{DiffMatchPatch, Efficient, Error, PatchInput};

// An example of a function that creates a diff and returns a set of patches serialized
fn at_source() -> Result<String, Error> {
    // initializing the module
    let dmp = DiffMatchPatch::new();

    // create a list of diffs
    let diffs = dmp.diff_main::<Efficient>(TXT_OLD, TXT_NEW)?;

    // Now, we are going to create a list of `patches` to be applied to the old text to get the new text
    let patches = dmp.patch_make(PatchInput::new_diffs(&diffs))?;

    // in the real world you are going to transmit or store this diff serialized to undiff format to be consumed or used somewhere elese
    let patch_txt = dmp.patch_to_text(&patches);

    Ok(patch_txt)
}

fn at_destination(patches: &str) -> Result<(), Error> {
    // initializing the module
    let dmp = DiffMatchPatch::new();

    // lets recreate the diffs from patches
    let patches = dmp.patch_from_text::<Efficient>(patches)?;

    // Now, lets apply these patches to the `old_txt` which is the original to get the new text
    let (new_txt, ops) = dmp.patch_apply(&patches, TXT_OLD)?;

    // Lets print out if the ops succeeded or not
    ops.iter()
        .for_each(|&o| println!("{}", if o { "OK" } else { "FAIL" }));

    // If everything goes as per plan you should see
    // OK
    // OK
    // ... and so on

    // lets check out if our `NEW_TXT` (presumably the edited one)
    if new_txt != TXT_NEW {
        return Err(Error::InvalidInput);
    }

    println!("Wallah! Patch applied successfully!");

    Ok(())
}

fn main() -> Result<(), Error> {
    // At the source of diff where the old text is being edited we'll create a set of patches
    let patches = at_source()?;

    // We'll send this diff to some destination e.g. db or the client where these changes are going to be applied

    // The destination will receive the patch string and will apply the patches to recreate the edits
    at_destination(&patches)
}

```


## Benchmarks
Benchmarks are maintained [diff-match-patch-bench repository](https://github.com/AnubhabB/diff-match-patch-rs-bench)

| Lang.   | Library                                                                                  | Diff Avg. | Patch Avg. | Bencher    | Mode        | Correct |
|:-------:|:----------------------------------------------------------------------------------------:|:---------:|:----------:|:----------:|:-----------:|:-------:|
| `rust`  | [diff_match_patch v0.1.1<sup>**</sup>](https://crates.io/crates/diff_match_patch)        | 68.108 ms | 10.596 ms | Criterion   | -           |    ✅   |
| `rust`  | [diffmatchpatch v0.0.4<sup>***</sup>](https://crates.io/crates/diffmatchpatch)           | 66.454 ms | -         | Criterion   | -           |    ❌   |
| `rust`  | [dmp v0.2.0](https://crates.io/crates/dmp)                                               | 69.019 ms | 14.654 ms | Criterion   | -           |    ✅   |
| `rust`  | [diff-match-patch-rs](https://github.com/AnubhabB/diff-match-patch-rs.git)<sup>our</sup> | 65.487 ms | 631.13 µs | Criterion   | `Efficient` |    ✅   |
| `rust`  | [diff-match-patch-rs](https://github.com/AnubhabB/diff-match-patch-rs.git)<sup>our</sup> | 65.642 ms | 1.1703 ms | Criterion   | `Compat`    |    ✅   |
| `go`    | [go-diff<sup>*</sup>](https://github.com/sergi/go-diff)                                  | 50.31 ms  | 135.2 ms  | go test     | -           |    ❌   |
| `node`  | [diff-match-patch](https://www.npmjs.com/package/diff-match-patch)                       | 246.90 ms | 1.07 ms   | tinybench   | -           |    ✅   |
| `python`| [diff-match-patch](https://pypi.org/project/diff-match-patch/)                           | 1.01 s    | 0.25 ms   | timeit      | -           |    ✅   |

>
> Note:
> Omitting [dissimilar](https://crates.io/crates/dissimilar) from the results, I believe that crate has different goals and a headon benchmark is not fair
> Results: Avg[197.30] High[197.46] Low[197.19]

>
> `*` [go-diff](https://github.com/sergi/go-diff) seems to generate wrong diffs for emoticons. This benchmark is on the text with the emoticons removed. <br>
> `**` Adds an extra clone to the iterator because the `patch_apply` method takes mutable refc. to `diffs` <br>
> `***` The crate [diffmatchpatch v0.0.4](https://crates.io/crates/diffmatchpatch) is still a WIP, cound't find the `patch_apply` method <br>

## Related projects

Diff Match Patch was originally built in 2006 to power Google Docs.
- [Diff Match Patch](https://github.com/google/diff-match-patch) (and it's [fork](https://github.com/dmsnell/diff-match-patch))
- **Rust**: [Distil.io diff_match_patch](https://crates.io/crates/diff_match_patch)
- **Rust**: [dmp](https://crates.io/crates/dmp)
- **Rust**: [Dissimilar](https://crates.io/crates/dissimilar) by the awesome [David Tolnay](https://github.com/dtolnay)
- **Rust**: [diff_match_patch](https://crates.io/crates/diff_match_patch)