From 4ff3a27de309f6688facb7a547b297f4bed2d4b9 Mon Sep 17 00:00:00 2001
From: Tue Nguyen
Date: Tue, 19 Apr 2022 22:33:47 -0400
Subject: [PATCH 1/4] Fix legacy backend/parser related docs

---
 src/api/parser/README.md | 18 ++++++++++--------
 src/web/docusaurus/docs/api-services/parser.md | 13 +++++++++++++
 src/web/docusaurus/docs/architecture.md | 16 ++--------------
 .../docusaurus/docs/contributing/debugging.md | 1 -
 .../docs/getting-started/environment-setup.md | 2 --
 .../docs/tools-and-technologies/pino.md | 4 ++--
 6 files changed, 27 insertions(+), 27 deletions(-)
 create mode 100644 src/web/docusaurus/docs/api-services/parser.md

diff --git a/src/api/parser/README.md b/src/api/parser/README.md
index 3a5e3a8686..778c4c8115 100644
--- a/src/api/parser/README.md
+++ b/src/api/parser/README.md
@@ -1,6 +1,12 @@
-# Parser Service: (To be updated when Parser service is dockerized and live)
+# Parser Service:
 
-The Parser service parses posts from user's feeds to populate Redis
+The current system uses the parser service in order to run the feed parser and feed queue, see [`./data/feed.js`](./src/data/feed.js). The blog feeds are stored in Supabase database, they are fetched to be loaded into a [`queue`](https://github.com/Seneca-CDOT/telescope/blob/master/src/api/parser/src/lib/queue.js) to create [`Feed`](./src/data/feed.js) and [`Post`](./src/data/post.js) objects to be stored in `Redis` (cache) and `Elasticsearch` (indexing) databases, and various microservices use these in order to get their data.
+
+Telescope's data model is built on Feeds and Posts. A feed represents an RSS/Atom feed, and includes metadata about a particular blog (e.g., URL, author, etc) as well as URLs to individual Posts. A Post includes metadata about a particular blog post (e.g., URL, date created, date updated, etc).
+
+To run the service, use command `pnpm services:start parser` or `pnpm services:start` or `pnpm dev` in `src/api/parser`. When it runs, the logs show information about feeds being parsed in real-time, which continues forever.
+
+The parser get all the feed urls and authors from Supabase database, parses them, creates `Feed` objects and puts them into a queue managed by [Bull](https://github.com/OptimalBits/bull) and backed by `Redis`. These are then processed in [`./src/feed/processor.js`](https://github.com/Seneca-CDOT/telescope/blob/master/src/api/parser/src/feed/processor.js) in order to download the individual Posts, which are also cached in Redis.
 
 ## Install
 
@@ -10,7 +16,7 @@ pnpm install
 
 ## Usage
 
-### Normal mode
+### Docker mode
 
 ```
 pnpm start
@@ -22,8 +28,4 @@ pnpm start
 pnpm dev
 ```
 
-By default the server is running on .
-
-### Examples
-
-## Docker
+By default the server is running on http://localhost:10000/.
diff --git a/src/web/docusaurus/docs/api-services/parser.md b/src/web/docusaurus/docs/api-services/parser.md
new file mode 100644
index 0000000000..c6f1719e96
--- /dev/null
+++ b/src/web/docusaurus/docs/api-services/parser.md
@@ -0,0 +1,13 @@
+---
+sidebar_position: 6
+---
+
+# Parser Service:
+
+The current system uses the parser service in order to run the feed parser and feed queue, see [`src/api/parser/data/feed.js`](https://github.com/Seneca-CDOT/telescope/blob/master/src/api/parser/src/data/feed.js). The blog feeds are stored in Supabase database, they are fetched to be loaded into a [`queue`](https://github.com/Seneca-CDOT/telescope/blob/master/src/api/parser/src/lib/queue.js) as jobs to be processed to create [`Feed`](https://github.com/Seneca-CDOT/telescope/blob/master/src/api/parser/src/data/feed.js) and [`Post`](https://github.com/Seneca-CDOT/telescope/blob/master/src/api/parser/src/data/post.js) objects to be stored in `Redis` (cache) and `Elasticsearch` (indexing) databases, and various microservices use these in order to get their data.
+
+Telescope's data model is built on Feeds and Posts. A feed represents an RSS/Atom feed, and includes metadata about a particular blog (e.g., URL, author, etc) as well as URLs to individual Posts. A Post includes metadata about a particular blog post (e.g., URL, date created, date updated, etc).
+
+To run the service, use command `pnpm services:start parser` or `pnpm services:start` or `pnpm dev` in `src/api/parser`. When it runs, the logs show information about feeds being parsed in real-time, which continues forever.
+
+The parser get all the feed urls and authors from Supabase database, parses them, creates `Feed` objects and puts them into a queue managed by [Bull](https://github.com/OptimalBits/bull) and backed by `Redis`. These are then processed in [`src/api/parser/feed/processor.js`](https://github.com/Seneca-CDOT/telescope/blob/master/src/api/parser/src/feed/processor.js) in order to download the individual Posts, which are also cached in Redis.
diff --git a/src/web/docusaurus/docs/architecture.md b/src/web/docusaurus/docs/architecture.md
index ab71a16f41..d50611c7f5 100644
--- a/src/web/docusaurus/docs/architecture.md
+++ b/src/web/docusaurus/docs/architecture.md
@@ -27,19 +27,7 @@ The following gives an overview of the current (i.e. 2.4.0) design of Telescope.
 
 ### Legacy Monolithic Back-end (1.0)
 
-Telescope's back-end began as a single, monolithic node.js app. During the 2.0 release, much of the back-end was split into separate microservices (see below). However, parts of the legacy back-end code are still in use, see `src/backend/*`.
-
-The current system uses the legacy backend in order to run the feed parser and feed queue, see `src/backend/feed/*`. The processed feeds and posts are then stored in Redis (cache) and Elasticsearch (indexing) databases, and various microservices use these in order to get their data.
-
-Telescope's data model is built on Feeds and Posts. A feed represents an RSS/Atom feed, and includes metadata about a particular blog (e.g., URL, author, etc) as well as URLs to individual Posts. A Post includes metadata about a particular blog post (e.g., URL, date created, date updated, etc).
-
-The legacy back-end is started using `pnpm start` in the root of the Telescope monorepo, and it (currently) must be run alongside the microservices. When it runs, the logs show information about feeds being parsed in real-time, which continues forever.
-
-The parser downloads the [CDOT Feed List](https://wiki.cdot.senecacollege.ca/wiki/Planet_CDOT_Feed_List#Feeds), parses it, creates `Feed` objects and puts them into a queue managed by [Bull](https://github.com/OptimalBits/bull) and backed by Redis. These are then processed in `src/backend/feed/processor.js` in order to download the individual Posts, which are also cached in Redis.
-
-There is code duplication between the current back-end and the Parser microservice (see `src/api/parser`), and anyone changing the back-end will also need to update the Parser service at the same time (for now). One of the 3.0 goals is to [remove the back-end and move all of this logic to the Parser service](https://github.com/Seneca-CDOT/telescope/issues?q=is%3Aissue+is%3Aopen+parser+service).
-
-In production, the legacy back-end is deployed as a container named `telescope` (see `docker/production.yml`), and its Dockerfile lives in the root at `./Dockerfile`.
+Telescope's back-end began as a single, monolithic node.js app. During the 2.0 release, much of the back-end was split into separate microservices (see below). However, parts of the legacy back-end code were still in use (see [`src/backend`](https://github.com/Seneca-CDOT/telescope/tree/d780d630abdd903b55a2a645b0f98ee96554e434/src/backend)) but they were eventually replaced by the `parser` service during 3.0 release.
 
 ### Back-end Microservices (2.0)
 
@@ -52,7 +40,7 @@ The legacy back-end has been split into a series of microservices. Each microser
 - Posts Service (`src/api/posts`) - API for accessing Post and Feed data in Redis (probably not well named at this point)
 - Search Service (`src/api/search`) - API for doing searches against Elasticsearch
 - Status Service (`src/api/status`) - API for accessing Telescope status information, as well as providing the Dashboards
-- Parser Service (`src/api/parser`) - feed and post parsing. Currently disabled, see
+- Parser Service (`src/api/parser`) - feed and post parsing which was disabled in 2.0 but now enabled.
 
 All microservices are built on a common foundation, the [Satellite module](https://github.com/Seneca-CDOT/telescope/tree/master/src/satellite). Satellite provides a common set of features for building Express-based microservices, with proper logging, health checks, headers, authorization middleware, as well as connections to Redis and Elasticsearch. It saves us having to manage the same set of dependencies a dozen times, and repeat the same boilerplate code.
 
diff --git a/src/web/docusaurus/docs/contributing/debugging.md b/src/web/docusaurus/docs/contributing/debugging.md
index 0c21c7c196..99f4d1daef 100644
--- a/src/web/docusaurus/docs/contributing/debugging.md
+++ b/src/web/docusaurus/docs/contributing/debugging.md
@@ -32,7 +32,6 @@ Now that we know how to launch the server. We will look at different launching o
 
 ![VS Launch Options Screenshot](../../static/img/VS-Launch-Options.png)
 
-1. Launch Telescope -> Launches index.js and tries to run the backend.
 1. Launch Auto Deployment -> Launches autodeployment found in tools.
 1. Launch All Tests -> Runs all the tests in the tests folder.
 1. Launch Opened Test File -> Will run a test that is currently opened in a VS code tab.
diff --git a/src/web/docusaurus/docs/getting-started/environment-setup.md b/src/web/docusaurus/docs/getting-started/environment-setup.md
index 0d9dac1f4f..aaf14b5afd 100644
--- a/src/web/docusaurus/docs/getting-started/environment-setup.md
+++ b/src/web/docusaurus/docs/getting-started/environment-setup.md
@@ -260,8 +260,6 @@ This is the default setting, you do not need to copy or modify any `env` file.
 
 ```bash
 pnpm services:start
-
-pnpm start
 ```
 
 Then visit `localhost:8000` in a web browser to see Telescope running locally. `localhost:3000/posts` will show you the list of posts in JSON
diff --git a/src/web/docusaurus/docs/tools-and-technologies/pino.md b/src/web/docusaurus/docs/tools-and-technologies/pino.md
index bc7e1619e0..769914cf0f 100644
--- a/src/web/docusaurus/docs/tools-and-technologies/pino.md
+++ b/src/web/docusaurus/docs/tools-and-technologies/pino.md
@@ -4,12 +4,12 @@ sidebar_position: 5
 
 # Logging Support Using Pino
 
-This project uses [Pino](http://getpino.io/#/) to provide support for logging in Production as well as development environments. The [logger.js](https://github.com/Seneca-CDOT/telescope/blob/master/src/backend/utils/logger.js) module exports a logger instance that can be used in other modules to implement logging for important events.
+This project uses [Pino](http://getpino.io/#/) to provide support for logging in Production as well as development environments. `Satellite` exports a [`logger`](https://github.com/Seneca-CDOT/telescope/blob/master/src/satellite/src/logger.js) instance that can be used in other modules to implement logging for important events.
 
 ## How to use the logger
 
 ```javascript
-const { logger } = require('../src/backend/utils/logger');
+const { logger } = require('@senecacdot/satellite');
 
 logger.info('Important information...');
 logger.trace('Information About Trace');

From 54818eba8a9841bf403cc9e83d14b6c7ce9b19c8 Mon Sep 17 00:00:00 2001
From: Tue Nguyen
Date: Tue, 19 Apr 2022 22:35:28 -0400
Subject: [PATCH 2/4] Add minor fix to architecture.md

---
 src/web/docusaurus/docs/architecture.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/web/docusaurus/docs/architecture.md b/src/web/docusaurus/docs/architecture.md
index d50611c7f5..72fdbfc6dc 100644
--- a/src/web/docusaurus/docs/architecture.md
+++ b/src/web/docusaurus/docs/architecture.md
@@ -40,7 +40,7 @@ The legacy back-end has been split into a series of microservices. Each microser
 - Posts Service (`src/api/posts`) - API for accessing Post and Feed data in Redis (probably not well named at this point)
 - Search Service (`src/api/search`) - API for doing searches against Elasticsearch
 - Status Service (`src/api/status`) - API for accessing Telescope status information, as well as providing the Dashboards
-- Parser Service (`src/api/parser`) - feed and post parsing which was disabled in 2.0 but now enabled.
+- Parser Service (`src/api/parser`) - feed and post parsing which was disabled in 2.0, but now enabled in 3.0 release.
 
 All microservices are built on a common foundation, the [Satellite module](https://github.com/Seneca-CDOT/telescope/tree/master/src/satellite). Satellite provides a common set of features for building Express-based microservices, with proper logging, health checks, headers, authorization middleware, as well as connections to Redis and Elasticsearch. It saves us having to manage the same set of dependencies a dozen times, and repeat the same boilerplate code.
 

From 3a2783ee9d1dd74fac620d19b9add977378f901e Mon Sep 17 00:00:00 2001
From: Tue Nguyen
Date: Wed, 20 Apr 2022 11:56:24 -0400
Subject: [PATCH 3/4] Rephrase

---
 src/api/parser/README.md | 2 +-
 src/web/docusaurus/docs/api-services/parser.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/api/parser/README.md b/src/api/parser/README.md
index 778c4c8115..6d874d0898 100644
--- a/src/api/parser/README.md
+++ b/src/api/parser/README.md
@@ -1,6 +1,6 @@
 # Parser Service:
 
-The current system uses the parser service in order to run the feed parser and feed queue, see [`./data/feed.js`](./src/data/feed.js). The blog feeds are stored in Supabase database, they are fetched to be loaded into a [`queue`](https://github.com/Seneca-CDOT/telescope/blob/master/src/api/parser/src/lib/queue.js) to create [`Feed`](./src/data/feed.js) and [`Post`](./src/data/post.js) objects to be stored in `Redis` (cache) and `Elasticsearch` (indexing) databases, and various microservices use these in order to get their data.
+The current system uses the parser service in order to run the feed parser and feed queue, see [`./data/feed.js`](https://github.com/Seneca-CDOT/telescope/blob/master/src/api/parser/src/data/feed.js). The blog feeds are stored into a supabase database. They are fetched and loaded into a [queue](https://github.com/Seneca-CDOT/telescope/blob/master/src/api/parser/src/lib/queue.js) to create [Feed](https://github.com/Seneca-CDOT/telescope/blob/master/src/api/parser/src/data/feed.js) and [Post](https://github.com/Seneca-CDOT/telescope/blob/master/src/api/parser/src/data/post.js) objects so it can be stored into `Redis` (cache) and `Elasticsearch` (indexing) database. Afterwards, various microservices can use them to request data.
 
 Telescope's data model is built on Feeds and Posts. A feed represents an RSS/Atom feed, and includes metadata about a particular blog (e.g., URL, author, etc) as well as URLs to individual Posts. A Post includes metadata about a particular blog post (e.g., URL, date created, date updated, etc).
 
diff --git a/src/web/docusaurus/docs/api-services/parser.md b/src/web/docusaurus/docs/api-services/parser.md
index c6f1719e96..397c25f841 100644
--- a/src/web/docusaurus/docs/api-services/parser.md
+++ b/src/web/docusaurus/docs/api-services/parser.md
@@ -4,7 +4,7 @@ sidebar_position: 6
 
 # Parser Service:
 
-The current system uses the parser service in order to run the feed parser and feed queue, see [`src/api/parser/data/feed.js`](https://github.com/Seneca-CDOT/telescope/blob/master/src/api/parser/src/data/feed.js). The blog feeds are stored in Supabase database, they are fetched to be loaded into a [`queue`](https://github.com/Seneca-CDOT/telescope/blob/master/src/api/parser/src/lib/queue.js) as jobs to be processed to create [`Feed`](https://github.com/Seneca-CDOT/telescope/blob/master/src/api/parser/src/data/feed.js) and [`Post`](https://github.com/Seneca-CDOT/telescope/blob/master/src/api/parser/src/data/post.js) objects to be stored in `Redis` (cache) and `Elasticsearch` (indexing) databases, and various microservices use these in order to get their data.
+The current system uses the parser service in order to run the feed parser and feed queue, see [`./data/feed.js`](https://github.com/Seneca-CDOT/telescope/blob/master/src/api/parser/src/data/feed.js). The blog feeds are stored into a supabase database. They are fetched and loaded into a [queue](https://github.com/Seneca-CDOT/telescope/blob/master/src/api/parser/src/lib/queue.js) to create [Feed](https://github.com/Seneca-CDOT/telescope/blob/master/src/api/parser/src/data/feed.js) and [Post](https://github.com/Seneca-CDOT/telescope/blob/master/src/api/parser/src/data/post.js) objects so it can be stored into `Redis` (cache) and `Elasticsearch` (indexing) database. Afterwards, various microservices can use them to request data.
 
 Telescope's data model is built on Feeds and Posts. A feed represents an RSS/Atom feed, and includes metadata about a particular blog (e.g., URL, author, etc) as well as URLs to individual Posts. A Post includes metadata about a particular blog post (e.g., URL, date created, date updated, etc).
 

From 00b5420ac76a8d51c298d1bb9d6b88493f4f89ae Mon Sep 17 00:00:00 2001
From: Tue Nguyen
Date: Fri, 22 Apr 2022 17:50:43 -0400
Subject: [PATCH 4/4] Fix sidebar pos

---
 src/web/docusaurus/docs/api-services/parser.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/web/docusaurus/docs/api-services/parser.md b/src/web/docusaurus/docs/api-services/parser.md
index 397c25f841..02a8ab23ab 100644
--- a/src/web/docusaurus/docs/api-services/parser.md
+++ b/src/web/docusaurus/docs/api-services/parser.md
@@ -1,5 +1,5 @@
 ---
-sidebar_position: 6
+sidebar_position: 8
 ---
 
 # Parser Service: