feat: add native documentation
null8626 committed Jun 21, 2024
1 parent 3d61e40 commit b7eefdd
Showing 9 changed files with 245 additions and 4 deletions.
6 changes: 5 additions & 1 deletion .gitattributes
@@ -1,18 +1,22 @@
.github/** eol=lf
.gitignore eol=lf
.gitmodules eol=lf
**/*.bin binary
**/*.wasm binary
**/*.jar binary
**/*.zip binary
**/*.h eol=lf
.prettierrc.js linguist-vendored=true eol=lf
**/CMakeLists.txt linguist-vendored=true eol=lf
**/Doxyfile linguist-vendored=true eol=lf
**/*.css linguist-vendored=true eol=lf
**/*.ts linguist-vendored=true eol=lf
**/*.mjs linguist-vendored=true eol=lf
**/*.cjs linguist-vendored=true eol=lf
**/*.html linguist-vendored=true eol=lf
**/*.java eol=lf
**/*.rc linguist-vendored=true eol=lf
bindings/native/ linguist-vendored=true eol=lf
**/*.java eol=lf
**/*.rs eol=lf
bindings/node/src/lib.js eol=lf
**/*.md eol=lf
15 changes: 14 additions & 1 deletion .github/workflows/CI.yml
@@ -674,17 +674,30 @@ jobs:
environment:
name: github-pages
url: ${{ steps.deployment.outputs.page_url }}
if: ${{ always() && (needs.setup.outputs.release != 'null' || needs.setup.outputs.wasm_affected == 'true') }}
if: ${{ always() && (needs.setup.outputs.release != 'null' || needs.setup.outputs.wasm_affected == 'true' || needs.setup.outputs.native_affected == 'true') }}
steps:
- uses: actions/checkout@v4
with:
submodules: recursive
token: ${{ secrets.GITHUB_TOKEN }}
- uses: actions/setup-node@v4
with:
node-version: latest
- uses: ssciwr/doxygen-install@v1
- name: Download wasm artifact
uses: actions/download-artifact@v4
with:
name: wasm
path: bindings/wasm/bin
- name: Move wasm example file
run: mv ./bindings/wasm/example.html ./wasm_example.html
shell: bash
- name: Generate native library documentation
working-directory: bindings/native/docs
run: |
node docgen.mjs
mv ./html ../../../native_docs
shell: bash
- name: Setup GitHub Pages
uses: actions/configure-pages@v5
- name: Delete GitHub Pages deployment history
3 changes: 3 additions & 0 deletions .gitignore
@@ -17,6 +17,9 @@
bindings/java/.gradle/
bindings/java/build/
bindings/java/gradle.properties
bindings/native/docs/package.json
bindings/native/docs/html/
bindings/native/docs/xml/
bindings/native/tests/*.c
bindings/native/tests/build/
bindings/native/docs/
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "bindings/native/docs/doxygen-awesome-css"]
path = bindings/native/docs/doxygen-awesome-css
url = https://github.com/jothepro/doxygen-awesome-css.git
7 changes: 6 additions & 1 deletion README.md
@@ -152,6 +152,7 @@ Tip: You can shrink the size of the resulting jar file by removing binaries in t
</details>
<details>
<summary><b>C/C++</b></summary>
<!---[ end, begin DECANCER_NATIVE ]--->

### Download

@@ -305,7 +306,7 @@ console.log(cured.toString())
</html>
```

[See this in action here.](https://null8626.github.io/decancer)
[See this in action here.](https://null8626.github.io/decancer/wasm_example.html)

</details>
<details>
@@ -348,6 +349,9 @@ public class Program {
</details>
<details>
<summary><b>C/C++</b></summary>
<!---[ end, begin DECANCER_NATIVE ]--->

For more information, please read the [documentation](https://null8626.github.io/native_docs/index.html).

UTF-8 example:

@@ -438,6 +442,7 @@ END:
}
```

<!---[ end, begin DECANCER_GLOBAL ]--->
</details>
<!---[ end ]--->

167 changes: 167 additions & 0 deletions bindings/native/README.md
@@ -0,0 +1,167 @@
<!-- WARNING: this markdown file is computer generated.
please modify the README.md file in the root directory instead. -->

# decancer [![npm][npm-image]][npm-url] [![crates.io][crates-io-image]][crates-io-url] [![npm downloads][npm-downloads-image]][npm-url] [![crates.io downloads][crates-io-downloads-image]][crates-io-url] [![codacy][codacy-image]][codacy-url] [![ko-fi][ko-fi-brief-image]][ko-fi-url]

[crates-io-image]: https://img.shields.io/crates/v/decancer?style=flat-square
[crates-io-downloads-image]: https://img.shields.io/crates/d/decancer?style=flat-square
[crates-io-url]: https://crates.io/crates/decancer
[npm-image]: https://img.shields.io/npm/v/decancer.svg?style=flat-square
[npm-url]: https://npmjs.org/package/decancer
[npm-downloads-image]: https://img.shields.io/npm/dt/decancer.svg?style=flat-square
[codacy-image]: https://app.codacy.com/project/badge/Grade/d740b1aa867d42f2b37eb992ad73784a
[codacy-url]: https://app.codacy.com/gh/null8626/decancer/dashboard
[ko-fi-brief-image]: https://img.shields.io/badge/donations-ko--fi-red?color=ff5e5b&style=flat-square
[ko-fi-image]: https://ko-fi.com/img/githubbutton_sm.svg
[ko-fi-url]: https://ko-fi.com/null8626

A library that removes common unicode confusables/homoglyphs from strings.

- Its core is written in [Rust](https://www.rust-lang.org) and utilizes a form of [**Binary Search**](https://en.wikipedia.org/wiki/Binary_search_algorithm) to ensure speed!
- By default, it's capable of filtering **221,529 (19.88%) different unicode codepoints** like:
  - All [whitespace characters](https://en.wikipedia.org/wiki/Whitespace_character)
  - All [diacritics](https://en.wikipedia.org/wiki/Diacritic), which also eliminates all forms of [Zalgo text](https://en.wikipedia.org/wiki/Zalgo_text)
  - Most [leetspeak characters](https://en.wikipedia.org/wiki/Leet)
  - Most [homoglyphs](https://en.wikipedia.org/wiki/Homoglyph)
  - Several emojis
- Unlike other packages, this package is **[unicode bidi-aware](https://en.wikipedia.org/wiki/Bidirectional_text)**: it interprets right-to-left characters the same way an application would render them!
- Its behavior is also highly customizable to your liking!

## Installation
### Download

- [Header file](https://raw.githubusercontent.com/null8626/decancer/v3.2.2/bindings/native/decancer.h)
- [Download for ARM64 macOS (11.0+, Big Sur+)](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-aarch64-apple-darwin.zip)
- [Download for ARM64 iOS](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-aarch64-apple-ios.zip)
- [Download for Apple iOS Simulator on ARM64](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-aarch64-apple-ios-sim.zip)
- [Download for ARM64 Android](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-aarch64-linux-android.zip)
- [Download for ARM64 Windows MSVC](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-aarch64-pc-windows-msvc.zip)
- [Download for ARM64 Linux (kernel 4.1, glibc 2.17+)](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-aarch64-unknown-linux-gnu.zip)
- [Download for ARM64 Linux with MUSL](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-aarch64-unknown-linux-musl.zip)
- [Download for ARMv6 Linux (kernel 3.2, glibc 2.17)](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-arm-unknown-linux-gnueabi.zip)
- [Download for ARMv5TE Linux (kernel 4.4, glibc 2.23)](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-armv5te-unknown-linux-gnueabi.zip)
- [Download for ARMv7-A Android](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-armv7-linux-androideabi.zip)
- [Download for ARMv7-A Linux (kernel 4.15, glibc 2.27)](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-armv7-unknown-linux-gnueabi.zip)
- [Download for ARMv7-A Linux, hardfloat (kernel 3.2, glibc 2.17)](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-armv7-unknown-linux-gnueabihf.zip)
- [Download for 32-bit Linux w/o SSE (kernel 3.2, glibc 2.17)](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-i586-unknown-linux-gnu.zip)
- [Download for 32-bit MSVC (Windows 7+)](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-i686-pc-windows-msvc.zip)
- [Download for 32-bit FreeBSD](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-i686-unknown-freebsd.zip)
- [Download for 32-bit Linux (kernel 3.2+, glibc 2.17+)](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-i686-unknown-linux-gnu.zip)
- [Download for PPC64LE Linux (kernel 3.10, glibc 2.17)](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-powerpc64le-unknown-linux-gnu.zip)
- [Download for RISC-V Linux (kernel 4.20, glibc 2.29)](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-riscv64gc-unknown-linux-gnu.zip)
- [Download for S390x Linux (kernel 3.2, glibc 2.17)](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-s390x-unknown-linux-gnu.zip)
- [Download for SPARC Solaris 11, illumos](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-sparcv9-sun-solaris.zip)
- [Download for Thumb2-mode ARMv7-A Linux with NEON (kernel 4.4, glibc 2.23)](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-thumbv7neon-unknown-linux-gnueabihf.zip)
- [Download for 64-bit macOS (10.12+, Sierra+)](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-x86_64-apple-darwin.zip)
- [Download for 64-bit iOS](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-x86_64-apple-ios.zip)
- [Download for 64-bit MSVC (Windows 7+)](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-x86_64-pc-windows-msvc.zip)
- [Download for 64-bit FreeBSD](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-x86_64-unknown-freebsd.zip)
- [Download for 64-bit illumos](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-x86_64-unknown-illumos.zip)
- [Download for 64-bit Linux (kernel 3.2+, glibc 2.17+)](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-x86_64-unknown-linux-gnu.zip)
- [Download for 64-bit Linux with MUSL](https://github.com/null8626/decancer/releases/download/v3.2.2/decancer-x86_64-unknown-linux-musl.zip)

### Building from source

Building from source requires [Rust v1.65 or later](https://rustup.rs/).

```sh
git clone https://github.com/null8626/decancer.git --depth 1
cd decancer/bindings/native
cargo build --release
```

The binary files should then be generated in the `target/release` directory.
## Examples
UTF-8 example:

```c
#include <decancer.h>

#include <string.h>
#include <stdlib.h>
#include <stdio.h>

#define decancer_assert(expr, notes) \
  if (!(expr)) { \
    fprintf(stderr, "assertion failure at " notes "\n"); \
    ret = 1; \
    goto END; \
  }

int main(void) {
  int ret = 0;

  // UTF-8 bytes for "vEⓡ𝔂 𝔽𝕌Ňℕy ţ乇𝕏𝓣"
  uint8_t input[] = {0x76, 0xef, 0xbc, 0xa5, 0xe2, 0x93, 0xa1, 0xf0, 0x9d, 0x94, 0x82, 0x20, 0xf0, 0x9d,
                     0x94, 0xbd, 0xf0, 0x9d, 0x95, 0x8c, 0xc5, 0x87, 0xe2, 0x84, 0x95, 0xef, 0xbd, 0x99,
                     0x20, 0xc5, 0xa3, 0xe4, 0xb9, 0x87, 0xf0, 0x9d, 0x95, 0x8f, 0xf0, 0x9d, 0x93, 0xa3};

  decancer_error_t error;
  decancer_cured_t cured = decancer_cure(input, sizeof(input), DECANCER_OPTION_DEFAULT, &error);

  if (cured == NULL) {
    fprintf(stderr, "curing error: %.*s\n", (int)error.message_length, error.message);
    return 1;
  }

  decancer_assert(decancer_contains(cured, "funny", 5), "decancer_contains");

END:
  decancer_cured_free(cured);
  return ret;
}
```

UTF-16 example:

```c
#include <decancer.h>

#include <string.h>
#include <stdlib.h>
#include <stdio.h>

#define decancer_assert(expr, notes) \
  if (!(expr)) { \
    fprintf(stderr, "assertion failure at " notes "\n"); \
    ret = 1; \
    goto END; \
  }

int main(void) {
  int ret = 0;

  // UTF-16 bytes for "vEⓡ𝔂 𝔽𝕌Ňℕy ţ乇𝕏𝓣"
  uint16_t input[] = {
    0x0076, 0xff25, 0x24e1,
    0xd835, 0xdd02, 0x0020,
    0xd835, 0xdd3d, 0xd835,
    0xdd4c, 0x0147, 0x2115,
    0xff59, 0x0020, 0x0163,
    0x4e47, 0xd835, 0xdd4f,
    0xd835, 0xdce3
  };

  // UTF-16 bytes for "funny"
  uint16_t funny[] = { 0x66, 0x75, 0x6e, 0x6e, 0x79 };

  decancer_error_t error;
  decancer_cured_t cured = decancer_cure_utf16(input, sizeof(input) / sizeof(uint16_t), DECANCER_OPTION_DEFAULT, &error);

  if (cured == NULL) {
    fprintf(stderr, "curing error: %.*s\n", (int)error.message_length, error.message);
    return 1;
  }

  decancer_assert(decancer_contains_utf16(cured, funny, sizeof(funny) / sizeof(uint16_t)), "decancer_contains_utf16");

END:
  decancer_cured_free(cured);
  return ret;
}
```
## Donations

If you want to support my eyes for manually looking at thousands of unicode characters, consider donating! ❤

[![ko-fi][ko-fi-image]][ko-fi-url]
45 changes: 45 additions & 0 deletions bindings/native/decancer.h
@@ -198,6 +198,7 @@
* @brief Prevents decancer from curing all emojis.
*/
#define DECANCER_OPTION_RETAIN_EMOJIS (1 << 21)

/**
* @brief Removes all non-ASCII characters from the result.
*
@@ -225,6 +226,13 @@
/**
* @brief Represents an error caused by decancer not being able to cure a string.
*
* ```c
* typedef struct {
*   const char* message;
*   uint8_t message_length;
* } decancer_error_t;
* ```
*
* @see decancer_cure
* @see decancer_cure_utf16
*/
@@ -243,6 +251,13 @@ typedef struct {
/**
* @brief Represents a UTF-8 encoded keyword. This struct is often used inside an array.
*
* ```c
* typedef struct {
*   const uint8_t* string;
*   size_t size;
* } decancer_keyword_t;
* ```
*
* @see decancer_find_multiple
* @see decancer_censor_multiple
* @see decancer_replace_multiple
@@ -262,6 +277,13 @@ typedef struct {
/**
* @brief Represents a UTF-16 encoded keyword. This struct is often used inside an array.
*
* ```c
* typedef struct {
*   const uint16_t* string;
*   size_t length;
* } decancer_keyword_utf16_t;
* ```
*
* @see decancer_find_multiple_utf16
* @see decancer_censor_multiple_utf16
* @see decancer_replace_multiple_utf16
@@ -330,6 +352,22 @@ typedef void* decancer_matches_t;
/**
* @brief Represents a translation of a unicode codepoint.
*
* ```c
* typedef struct {
*   uint8_t kind;
*
*   union {
*     uint32_t character;
*
*     struct {
*       const uint8_t* contents;
*       size_t size;
*       void* __heap;
*     } string;
*   } contents;
* } decancer_translation_t;
* ```
*
* @see decancer_cure_char
* @see decancer_translation_init
* @see decancer_translation_clone
@@ -393,6 +431,13 @@ typedef void* decancer_cured_t;
/**
* @brief Represents a match in UTF-8 indices.
*
* ```c
* typedef struct {
*   size_t start;
*   size_t end;
* } decancer_match_t;
* ```
*
* @see decancer_find
* @see decancer_find_utf16
* @see decancer_matcher_consume
File renamed without changes.
3 changes: 2 additions & 1 deletion scripts/ci_setup_pages.mjs
@@ -9,8 +9,9 @@ import { fileURLToPath } from 'node:url'
const ROOT_DIR = join(dirname(fileURLToPath(import.meta.url)), '..')
const MINIFIED_JS = join(ROOT_DIR, 'bindings', 'wasm', 'bin', 'decancer.min.js')
const EXCLUDED = [
'index.html',
'wasm_example.html',
['bindings', 'wasm', 'bin'],
['native_docs'],
['scripts'],
['.git']
]
