Showing 5 changed files with 172 additions and 12 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,76 @@ | ||
# Contributor Covenant Code of Conduct | ||
|
||
## Our Pledge | ||
|
||
In the interest of fostering an open and welcoming environment, we as | ||
contributors and maintainers pledge to making participation in our project and | ||
our community a harassment-free experience for everyone, regardless of age, body | ||
size, disability, ethnicity, sex characteristics, gender identity and expression, | ||
level of experience, education, socio-economic status, nationality, personal | ||
appearance, race, religion, or sexual identity and orientation. | ||
|
||
## Our Standards | ||
|
||
Examples of behavior that contributes to creating a positive environment | ||
include: | ||
|
||
* Using welcoming and inclusive language | ||
* Being respectful of differing viewpoints and experiences | ||
* Gracefully accepting constructive criticism | ||
* Focusing on what is best for the community | ||
* Showing empathy towards other community members | ||
|
||
Examples of unacceptable behavior by participants include: | ||
|
||
* The use of sexualized language or imagery and unwelcome sexual attention or | ||
advances | ||
* Trolling, insulting/derogatory comments, and personal or political attacks | ||
* Public or private harassment | ||
* Publishing others' private information, such as a physical or electronic | ||
address, without explicit permission | ||
* Other conduct which could reasonably be considered inappropriate in a | ||
professional setting | ||
|
||
## Our Responsibilities | ||
|
||
Project maintainers are responsible for clarifying the standards of acceptable | ||
behavior and are expected to take appropriate and fair corrective action in | ||
response to any instances of unacceptable behavior. | ||
|
||
Project maintainers have the right and responsibility to remove, edit, or | ||
reject comments, commits, code, wiki edits, issues, and other contributions | ||
that are not aligned to this Code of Conduct, or to ban temporarily or | ||
permanently any contributor for other behaviors that they deem inappropriate, | ||
threatening, offensive, or harmful. | ||
|
||
## Scope | ||
|
||
This Code of Conduct applies both within project spaces and in public spaces | ||
when an individual is representing the project or its community. Examples of | ||
representing a project or community include using an official project e-mail | ||
address, posting via an official social media account, or acting as an appointed | ||
representative at an online or offline event. Representation of a project may be | ||
further defined and clarified by project maintainers. | ||
|
||
## Enforcement | ||
|
||
Instances of abusive, harassing, or otherwise unacceptable behavior may be | ||
reported by contacting the project team at [email protected]. All | ||
complaints will be reviewed and investigated and will result in a response that | ||
is deemed necessary and appropriate to the circumstances. The project team is | ||
obligated to maintain confidentiality with regard to the reporter of an incident. | ||
Further details of specific enforcement policies may be posted separately. | ||
|
||
Project maintainers who do not follow or enforce the Code of Conduct in good | ||
faith may face temporary or permanent repercussions as determined by other | ||
members of the project's leadership. | ||
|
||
## Attribution | ||
|
||
This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, | ||
available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html | ||
|
||
[homepage]: https://www.contributor-covenant.org | ||
|
||
For answers to common questions about this code of conduct, see | ||
https://www.contributor-covenant.org/faq |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
## Template when making a new issue | ||
|
||
Please make sure that the following boxes are checked before submitting a new issue. There is a small chance that you can solve it on your own, or that it was already addressed by someone. | ||
|
||
Thank you! | ||
|
||
### Pre-checks | ||
|
||
- [ ] Check that you are up-to-date with the master branch of NALP. You can update with: | ||
pip install git+git://github.com/gugarosa/nalp.git --upgrade --no-deps | ||
|
||
- [ ] Check that you have read all of our [README](https://github.com/gugarosa/nalp/blob/master/README.md). | ||
|
||
### Description | ||
|
||
[Description of the issue] | ||
|
||
### Link | ||
[Provide a link to a GitHub Gist of a Python script that can reproduce your issue, or paste the script here] | ||
|
||
### Steps to Reproduce | ||
|
||
1. [First Step] | ||
2. [Second Step] | ||
3. [...] | ||
|
||
**Expected behavior:** [It should be what you expect to happen] | ||
|
||
**Actual behavior:** [What actually happens] | ||
|
||
**Reproduces how often:** [How often does it occur? Give a percentage] | ||
|
||
### Additional Information | ||
|
||
Any additional information, configuration or data that might be necessary to reproduce the issue. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,9 @@ | ||
# NALP: Natural Adversarial Language Processing | ||
|
||
[![Latest release](https://img.shields.io/github/release/gugarosa/nalp.svg)](https://github.com/gugarosa/nalp/releases) | ||
[![Open issues](https://img.shields.io/github/issues/gugarosa/nalp.svg)](https://github.com/gugarosa/nalp/issues) | ||
[![License](https://img.shields.io/github/license/gugarosa/nalp.svg)](https://github.com/gugarosa/nalp/blob/master/LICENSE) | ||
|
||
## Welcome to NALP. | ||
Have you ever wanted to create natural text from raw sources? If yes, NALP is for you! This package is an innovative way of dealing with natural language processing and adversarial learning. From bottom to top, from embeddings to neural networks, we will foster all research related to this new trend. | ||
|
||
|
@@ -22,8 +26,6 @@ NALP is compatible with: **Python 2.7-3.6**. | |
3. Note that there might be some **additional** steps in order to use our solutions. | ||
4. If there is a problem, please do not **hesitate**, call us. | ||
|
||
|
||
|
||
--- | ||
|
||
## Getting started: 60 seconds with NALP | ||
|
@@ -41,35 +43,80 @@ NALP is based on the following structure, and you should pay attention to its tr | |
- encoder | ||
- neural | ||
- datasets | ||
- one_hot | ||
- vanilla | ||
- encoders | ||
- count | ||
- tfidf | ||
- word2vec | ||
- neurals | ||
- rnn | ||
- stream | ||
- loader | ||
- preprocess | ||
- utils | ||
- decorators | ||
- logging | ||
- splitters | ||
- visualization | ||
``` | ||
|
||
### Core | ||
|
||
Core is the core. Essentially, it is the parent of everything. You should find parent classes defining the basis of our structure. They should provide variables and methods that will help to construct other modules. It is composed of the following classes: | ||
|
||
1. Dataset (used to receive data (raw or pre-processed) and prepare it for the neural package methods) | ||
```dataset```: Used to receive data (raw or pre-processed) and prepare it for the neural package methods. | ||
|
||
2. Encoder (You can use different pre-established encoders as well. They should provide the matrix you have wanted all along. For example: CountVectorizer, TF-IDF and Word2Vec.) | ||
```encoder```: You can use different pre-established encoders as well. They should provide the matrix you have wanted all along. For example: CountVectorizer, TF-IDF and Word2Vec. | ||
|
||
3. Neural (This is the brain of the system. It will hold all the high-level methods that interact directly with TensorFlow. That is it. No Keras. We use TensorFlow. RAW. We like to learn and believe that machine learning is mathematics. TF proved to be great for us.) | ||
```neural```: This is the brain of the system. It will hold all the high-level methods that interact directly with TensorFlow. That is it. No Keras. We use TensorFlow. RAW. We like to learn and believe that machine learning is mathematics. TF proved to be great for us. | ||
|
||
### Datasets | ||
|
||
Because we need data, right? Datasets are composed of classes and methods that prepare data for the neural networks. | ||
|
||
```one_hot```: A one-hot encoded dataset. This serves as a basis for predicting next characters or words. | ||
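
As a rough illustration of the idea behind a one-hot dataset, here is a minimal sketch in plain Python. It is not NALP's implementation: the function names (`build_vocab`, `one_hot`, `next_char_pairs`) are hypothetical and chosen only for this example. Each character is turned into a one-hot vector and paired with the vector of the character that follows it.

```python
def build_vocab(text):
    """Map each distinct character to an integer index (sorted for determinism)."""
    return {ch: i for i, ch in enumerate(sorted(set(text)))}


def one_hot(index, size):
    """Return a list of `size` zeros with a single 1 at position `index`."""
    vector = [0] * size
    vector[index] = 1
    return vector


def next_char_pairs(text, vocab):
    """Pair each encoded character with the encoding of the character after it."""
    encoded = [one_hot(vocab[ch], len(vocab)) for ch in text]
    return list(zip(encoded[:-1], encoded[1:]))


# Hypothetical toy input, used only to show the shape of the result.
vocab = build_vocab("hello")
pairs = next_char_pairs("hello", vocab)
```

Each element of `pairs` is an (input, target) tuple, which is the form a next-character predictor would typically train on.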
|
||
```vanilla```: A vanilla dataset, used to load (X, Y) sample pairs. It can be seen as a basic structure for common datasets, where each sample is composed of features and labels. | ||
|
||
### Encoders | ||
|
||
Text or numbers? Encodings are used to make embeddings. Embeddings are used to feed into neural networks. Remember that networks cannot read raw data; therefore, you might want to pre-encode your data using well-known encoders. | ||
|
||
```count```: CountVectorizer encoding. | ||
|
||
```tfidf```: TF-IDF encoding. | ||
|
||
```word2vec```: Word2Vec encoding. | ||
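
To make the difference between count and TF-IDF encodings concrete, the sketch below computes both with the standard library only. It is an illustration, not NALP's code: the function names are hypothetical, tokenization is a plain whitespace split, and the inverse document frequency is the unsmoothed `log(N / df)`. Library implementations may differ; for example, scikit-learn's `TfidfVectorizer` uses a smoothed variant by default.

```python
import math
from collections import Counter


def count_vectors(documents):
    """Return (vocabulary, matrix) where matrix[d][t] counts term t in document d."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


def tfidf_vectors(documents):
    """Weight raw counts by a plain inverse document frequency, log(N / df)."""
    vocabulary, counts = count_vectors(documents)
    n_docs = len(documents)
    # Document frequency: number of documents in which each term appears.
    doc_freq = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocabulary))]
    idf = [math.log(n_docs / df) for df in doc_freq]
    weighted = [[tf * w for tf, w in zip(row, idf)] for row in counts]
    return vocabulary, weighted


# Hypothetical toy corpus.
docs = ["the cat sat", "the dog sat down"]
vocabulary, counts = count_vectors(docs)
_, weights = tfidf_vectors(docs)
```

Terms that occur in every document, such as "the" and "sat" here, receive a weight of zero under this formula, while terms unique to one document keep a positive weight.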
|
||
### Neurals | ||
|
||
A neural networks package. In this package you can find all neural-related implementations. From naïve RNNs to BiLSTMs, you can use whatever suits your needs. All implementations were done using raw TensorFlow, mainly to better understand and control the whole training and inference process. | ||
|
||
```rnn```: A naïve Recurrent Neural Network implementation. | ||
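
For readers unfamiliar with the recurrence a naïve RNN computes, the following sketch spells it out in plain Python. It is deliberately not TensorFlow and not NALP's implementation; the names `rnn_step` and `rnn_forward` and the toy weights are assumptions made for this example. Each step computes `h_t = tanh(W_xh x_t + W_hh h_{t-1} + b)`.

```python
import math


def rnn_step(x, h_prev, w_xh, w_hh, bias):
    """Compute h_t = tanh(W_xh x_t + W_hh h_{t-1} + b) for one time step."""
    h_next = []
    for i in range(len(h_prev)):
        total = bias[i]
        total += sum(w_xh[i][j] * x[j] for j in range(len(x)))
        total += sum(w_hh[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_next.append(math.tanh(total))
    return h_next


def rnn_forward(inputs, h0, w_xh, w_hh, bias):
    """Run the recurrence over a sequence and return every hidden state."""
    states = []
    h = h0
    for x in inputs:
        h = rnn_step(x, h, w_xh, w_hh, bias)
        states.append(h)
    return states


# Toy sizes: 2 input features, 2 hidden units, all-zero weights.
zeros = [[0.0, 0.0], [0.0, 0.0]]
states = rnn_forward([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], zeros, zeros, [0.0, 0.0])
```

The hidden state returned by one step is fed back in as `h_prev` for the next, which is what lets the network carry information along the sequence.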
|
||
### Stream | ||
|
||
A stream package is used to manipulate data. From loading to processing, here you can find all classes and methods defined in order to help you achieve these tasks. | ||
|
||
```loader```: Loading module, used to load external text data. | ||
|
||
```preprocess```: Pre-processing module, used to pre-process and tokenize text, among other operations. | ||
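
A minimal sketch of the kind of step a pre-processing module performs is shown below. The function names and the regular expression are assumptions for illustration only and do not reflect NALP's actual API.

```python
import re


def tokenize_words(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def tokenize_chars(text):
    """Split the text into a list of single characters."""
    return list(text)


# Hypothetical input string.
tokens = tokenize_words("Hello, NALP! It's working.")
```

Word-level tokens suit word prediction, while character-level tokens suit next-character models.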
|
||
### Utils | ||
|
||
This is a utilities package. Common things shared across the application should be implemented here. It is better to implement once and use as you wish than to re-implement the same thing over and over again. | ||
|
||
```decorators```: Decorators used to facilitate repetitive declarations. | ||
|
||
```logging```: Logging tools to track the progress of a NALP task. | ||
|
||
```splitters```: A data splitter tool. It might be removed in the future. | ||
|
||
### Visualization | ||
|
||
A visualization package to better illustrate what is happening with your data. Use its classes and methods to help you decide whether your data is good enough to meet your needs. | ||
|
||
--- | ||
|
||
## Installation | ||
|
@@ -102,6 +149,6 @@ No specific additional commands needed. | |
|
||
## Support | ||
|
||
We know that we do our best, but it's inevitable to acknowledge that we make mistakes. If you ever need to report a bug, report a problem, or talk to us, please do so! We will do our best to be available at this repository or recogna@fc.unesp.br. | ||
We know that we do our best, but it's inevitable to acknowledge that we make mistakes. If you ever need to report a bug, report a problem, or talk to us, please do so! We will do our best to be available at this repository or [email protected]. | ||
|
||
--- |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,16 +8,15 @@ | |
author_email='[email protected]', | ||
url='https://github.com/gugarosa/nalp', | ||
license='MIT', | ||
python_requires='>=3.5', | ||
install_requires=['gensim>=3.5.0', | ||
'matplotlib>=3.0.0', | ||
'numpy>=1.13.3', | ||
'nltk>=3.2.5', | ||
'numpy>=1.13.3', | ||
'pandas>=0.23.4', | ||
'scikit-learn>=0.19.2', | ||
'scipy>=1.1.0', | ||
'pylint>=1.7.4', | ||
'pytest>=3.2.3', | ||
'scikit-learn>=0.19.2', | ||
'scipy>=1.1.0', | ||
], | ||
extras_require={ | ||
'tests': ['pytest', | ||
|
@@ -29,8 +28,9 @@ | |
'Intended Audience :: Developers', | ||
'Intended Audience :: Education', | ||
'Intended Audience :: Science/Research', | ||
'License :: OSI Approved :: Apache-2.0 License', | ||
'Programming Language :: Python :: 3.5', | ||
'License :: OSI Approved :: MIT License', | ||
'Programming Language :: Python :: 2.7', | ||
'Programming Language :: Python :: 3.6', | ||
'Topic :: Software Development :: Libraries', | ||
'Topic :: Software Development :: Libraries :: Python Modules' | ||
], | ||
|