Ready to publish first release.
gugarosa committed Feb 26, 2019
1 parent 376bbd3 commit 4f99ae7
Showing 5 changed files with 172 additions and 12 deletions.
76 changes: 76 additions & 0 deletions CODE_OF_CONDUCT.md
@@ -0,0 +1,76 @@
# Contributor Covenant Code of Conduct

## Our Pledge

In the interest of fostering an open and welcoming environment, we as
contributors and maintainers pledge to making participation in our project and
our community a harassment-free experience for everyone, regardless of age, body
size, disability, ethnicity, sex characteristics, gender identity and expression,
level of experience, education, socio-economic status, nationality, personal
appearance, race, religion, or sexual identity and orientation.

## Our Standards

Examples of behavior that contributes to creating a positive environment
include:

* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members

Examples of unacceptable behavior by participants include:

* The use of sexualized language or imagery and unwelcome sexual attention or
advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or electronic
address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting

## Our Responsibilities

Project maintainers are responsible for clarifying the standards of acceptable
behavior and are expected to take appropriate and fair corrective action in
response to any instances of unacceptable behavior.

Project maintainers have the right and responsibility to remove, edit, or
reject comments, commits, code, wiki edits, issues, and other contributions
that are not aligned to this Code of Conduct, or to ban temporarily or
permanently any contributor for other behaviors that they deem inappropriate,
threatening, offensive, or harmful.

## Scope

This Code of Conduct applies both within project spaces and in public spaces
when an individual is representing the project or its community. Examples of
representing a project or community include using an official project e-mail
address, posting via an official social media account, or acting as an appointed
representative at an online or offline event. Representation of a project may be
further defined and clarified by project maintainers.

## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported by contacting the project team at [email protected]. All
complaints will be reviewed and investigated and will result in a response that
is deemed necessary and appropriate to the circumstances. The project team is
obligated to maintain confidentiality with regard to the reporter of an incident.
Further details of specific enforcement policies may be posted separately.

Project maintainers who do not follow or enforce the Code of Conduct in good
faith may face temporary or permanent repercussions as determined by other
members of the project's leadership.

## Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html

[homepage]: https://www.contributor-covenant.org

For answers to common questions about this code of conduct, see
https://www.contributor-covenant.org/faq
35 changes: 35 additions & 0 deletions ISSUE_TEMPLATE.md
@@ -0,0 +1,35 @@
## Template when making a new issue

Please make sure that the following boxes are checked before submitting a new issue. There is a chance that you can solve it on your own, or that someone has already addressed it.

Thank you!

### Pre-checks

- [ ] Check that you are up-to-date with the master branch of NALP. You can update with:
pip install git+git://github.com/gugarosa/nalp.git --upgrade --no-deps

- [ ] Check that you have read all of our [README](https://github.com/gugarosa/nalp/blob/master/README.md).

### Description

[Description of the issue]

### Link
[Provide a link to a GitHub Gist of a Python script that can reproduce your issue, or just copy here]

### Steps to Reproduce

1. [First Step]
2. [Second Step]
3. [...]

**Expected behavior:** [It should be what you expect to happen]

**Actual behavior:** [What actually happens]

**Reproduces how often:** [How often does it occur? Give us a percentage]

### Additional Information

Any additional information, configuration or data that might be necessary to reproduce the issue.
59 changes: 53 additions & 6 deletions README.md
@@ -1,5 +1,9 @@
# NALP: Natural Adversarial Language Processing

[![Latest release](https://img.shields.io/github/release/gugarosa/nalp.svg)](https://github.com/gugarosa/nalp/releases)
[![Open issues](https://img.shields.io/github/issues/gugarosa/nalp.svg)](https://github.com/gugarosa/nalp/issues)
[![License](https://img.shields.io/github/license/gugarosa/nalp.svg)](https://github.com/gugarosa/nalp/blob/master/LICENSE)

## Welcome to NALP.
Have you ever wanted to create natural text from raw sources? If yes, NALP is for you! This package is an innovative way of dealing with natural language processing and adversarial learning. From bottom to top, from embeddings to neural networks, we will foster all research related to this new trend.

@@ -22,8 +26,6 @@ NALP is compatible with: **Python 2.7-3.6**.
3. Note that there might be some **additional** steps in order to use our solutions.
4. If there is a problem, please do not **hesitate** to contact us.



---

## Getting started: 60 seconds with NALP
@@ -41,35 +43,80 @@ NALP is based on the following structure, and you should pay attention to its tr
- encoder
- neural
- datasets
- one_hot
- vanilla
- encoders
- count
- tfidf
- word2vec
- neurals
- rnn
- stream
- loader
- preprocess
- utils
- decorators
- logging
- splitters
- visualization
```

### Core

Core is the core. Essentially, it is the parent of everything. Here you will find the parent classes that define the basis of our structure. They provide variables and methods that help to construct the other modules. It is composed of the following classes:

1. Dataset (used to receive data, either raw or pre-processed, and prepare it for further neural package methods)
```dataset```: Used to receive data, either raw or pre-processed, and prepare it for further neural package methods.

2. Encoder (You can use different pre-established encoders as well. They should provide the matrix you have wanted all along. For example: CountVectorizer, TF-IDF and Word2Vec.)
```encoder```: You can use different pre-established encoders as well. They should provide the matrix you have wanted all along. For example: CountVectorizer, TF-IDF and Word2Vec.

3. Neural (This is the brain of the system. It holds all the high-level methods that interact directly with TensorFlow. That is it. No Keras. We use raw TensorFlow. We like to learn and believe that machine learning is mathematics. TF proved to be great for us.)
```neural```: This is the brain of the system. It holds all the high-level methods that interact directly with TensorFlow. That is it. No Keras. We use raw TensorFlow. We like to learn and believe that machine learning is mathematics. TF proved to be great for us.

### Datasets

Because we need data, right? Datasets are composed of classes and methods that prepare data for the neural networks.

```one_hot```: A one-hot encoded dataset. This serves as a basis for predicting next characters or words.

```vanilla```: A vanilla dataset, used to load (X, Y) sample pairs. It can be seen as a basic structure for common datasets, where each sample is composed of features and labels.
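
The one-hot idea described for ```one_hot``` can be sketched in plain Python. This is only an illustration under assumptions: the function names `one_hot_encode` and `next_char_pairs` are hypothetical and are not NALP's API.

```python
def one_hot_encode(text):
    """Map each character of `text` to a one-hot vector over its vocabulary."""
    # A sorted vocabulary gives a stable character -> index mapping.
    vocab = sorted(set(text))
    char_to_index = {ch: i for i, ch in enumerate(vocab)}
    encoded = []
    for ch in text:
        vector = [0] * len(vocab)
        vector[char_to_index[ch]] = 1
        encoded.append(vector)
    return vocab, encoded


def next_char_pairs(encoded):
    """Pair every vector with the one that follows it: (input, target)."""
    return list(zip(encoded[:-1], encoded[1:]))


# Hypothetical usage: each pair is one training sample for next-character prediction.
vocab, encoded = one_hot_encode("hello")
pairs = next_char_pairs(encoded)
```

Each pair holds the current character as input and the following character as target, which is the shape of data a next-character model would train on.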

### Encoders

Text or Numbers? Encodings are used to make embeddings. Embeddings are used to feed into neural networks. Remember that networks cannot read raw data, therefore you might want to pre-encode your data using well-known encoders.

```count```: CountVectorizer encoding.

```tfidf```: TF-IDF encoding.

```word2vec```: Word2Vec encoding.
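
To make the count and TF-IDF encodings concrete, here is a minimal standard-library sketch. It is an assumption-laden illustration, not NALP's encoder classes: `count_encode` and `tfidf_encode` are hypothetical names, and the plain `log(N / df)` weighting differs from the smoothed, normalized variant that scikit-learn applies by default.

```python
import math
from collections import Counter


def count_encode(documents):
    """Return the vocabulary and one term-count row per document."""
    tokenized = [doc.lower().split() for doc in documents]
    vocab = sorted({token for tokens in tokenized for token in tokens})
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)  # missing terms count as 0
        rows.append([counts[term] for term in vocab])
    return vocab, rows


def tfidf_encode(documents):
    """Weight raw counts by a plain log(N / df) inverse document frequency."""
    vocab, rows = count_encode(documents)
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    doc_freq = [sum(1 for row in rows if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / df) for df in doc_freq]
    weighted = [[count * weight for count, weight in zip(row, idf)] for row in rows]
    return vocab, weighted


# Hypothetical usage on two tiny documents.
docs = ["the cat sat", "the dog sat down"]
vocab, counts = count_encode(docs)
_, weights = tfidf_encode(docs)
```

Terms that appear in every document (here "the" and "sat") receive a weight of zero, while terms unique to one document keep a positive weight.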

### Neurals

A neural networks package. In this package you can find all neural-related implementations. From naïve RNNs to BiLSTMs, you can use whatever suits your needs. All implementations were done using raw TensorFlow, mainly to better understand and control the whole training and inference process.

```rnn```: A naïve Recurrent Neural Network implementation.
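
For readers unfamiliar with what a naïve RNN computes, the following pure-Python sketch shows the forward recurrence h_t = tanh(W_xh x_t + W_hh h_(t-1) + b_h). It is not NALP's TensorFlow implementation; the function name `rnn_forward` and all weight values are assumptions chosen for illustration.

```python
import math


def rnn_forward(inputs, w_xh, w_hh, b_h):
    """Run a naive (Elman) RNN over a sequence and return every hidden state."""
    hidden_size = len(b_h)
    h = [0.0] * hidden_size  # initial hidden state
    states = []
    for x in inputs:
        new_h = []
        for i in range(hidden_size):
            total = b_h[i]
            total += sum(w_xh[i][j] * x[j] for j in range(len(x)))
            total += sum(w_hh[i][k] * h[k] for k in range(hidden_size))
            new_h.append(math.tanh(total))
        h = new_h
        states.append(h)
    return states


# Two one-hot inputs and a hidden size of 2; the weights are arbitrary.
sequence = [[1, 0], [0, 1]]
w_xh = [[0.5, -0.5], [0.1, 0.2]]
w_hh = [[0.1, 0.0], [0.0, 0.1]]
b_h = [0.0, 0.0]
states = rnn_forward(sequence, w_xh, w_hh, b_h)
```

The second hidden state depends on the first through `w_hh`, which is what lets the network carry information along the sequence.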

### Stream

A stream package is used to manipulate data. From loading to processing, here you can find all classes and methods defined in order to help you achieve these tasks.

```loader```: Loading module, used to load external text data.

```preprocess```: Pre-processing module, used to pre-process and tokenize text, among other operations.
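
As a rough idea of what a pre-processing step might do, the sketch below lowercases text, strips punctuation and splits it into word tokens. The function name `preprocess` and its exact behavior are assumptions for illustration; the real module may work differently.

```python
import re


def preprocess(text):
    """Lowercase the text, drop punctuation, and split it into word tokens."""
    lowered = text.lower()
    # Keep letters, digits and whitespace; replace everything else with a space.
    cleaned = re.sub(r"[^a-z0-9\s]", " ", lowered)
    return cleaned.split()


# Hypothetical usage on a short sentence.
tokens = preprocess("Hello, NALP! Text in; tokens out.")
```

The resulting token list is the kind of input the encoders described above would consume.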

### Utils

This is a utilities package. Common things shared across the application should be implemented here. It is better to implement something once and reuse it than to re-implement the same thing over and over again.

```decorators```: Decorators used to facilitate repetitive declarations.

```logging```: Logging tools to track the progress of a NALP task.

```splitters```: A data splitter tool. It might be removed in the future.

### Visualization

A visualization package to better illustrate what is happening with your data. Use its classes and methods to help you decide whether your data is good enough for your needs.

---

## Installation
@@ -102,6 +149,6 @@ No specific additional commands needed.

## Support

We know that we do our best, but it's inevitable to acknowledge that we make mistakes. If you ever need to report a bug, report a problem, or talk to us, please do so! We will do our best to be available at this repository or recogna@fc.unesp.br.
We know that we do our best, but it's inevitable to acknowledge that we make mistakes. If you ever need to report a bug, report a problem, or talk to us, please do so! We will do our best to be available at this repository or [email protected].

---
2 changes: 2 additions & 0 deletions requirements.txt
@@ -3,5 +3,7 @@ matplotlib>=3.0.0
nltk>=3.2.5
numpy>=1.13.3
pandas>=0.23.4
pylint>=1.7.4
pytest>=3.2.3
scikit-learn>=0.19.2
scipy>=1.1.0
12 changes: 6 additions & 6 deletions setup.py
@@ -8,16 +8,15 @@
author_email='[email protected]',
url='https://github.com/gugarosa/nalp',
license='MIT',
python_requires='>=3.5',
install_requires=['gensim>=3.5.0',
'matplotlib>=3.0.0',
'numpy>=1.13.3',
'nltk>=3.2.5',
'numpy>=1.13.3',
'pandas>=0.23.4',
'scikit-learn>=0.19.2',
'scipy>=1.1.0',
'pylint>=1.7.4',
'pytest>=3.2.3',
'scikit-learn>=0.19.2',
'scipy>=1.1.0',
],
extras_require={
'tests': ['pytest',
@@ -29,8 +28,9 @@
'Intended Audience :: Developers',
'Intended Audience :: Education',
'Intended Audience :: Science/Research',
'License :: OSI Approved :: Apache-2.0 License',
'Programming Language :: Python :: 3.5',
'License :: OSI Approved :: MIT License',
'Programming Language :: Python :: 2.7',
'Programming Language :: Python :: 3.6',
'Topic :: Software Development :: Libraries',
'Topic :: Software Development :: Libraries :: Python Modules'
],