chgov-brprotokolle-server

chgov-brprotokolle

Context

The chgov-brprotokolle project centers on managing, retrieving and displaying historic minutes of the Federal Council, based on the IIIF standard. The live project set-up can be experienced on the Swiss Federal Archives' site. The project is split into four dedicated repositories. This repository, chgov-brprotokolle-server, is the backend for the ingestion of minutes and the interface for SOLR search requests. It was developed in TypeScript and is based on the archival-iiif-server. The other repositories are the publicly accessible frontend (chgov-brprotokolle-frontend), a frontend utility that enables proper OCR display in Mirador (chgov-brprotokolle-mirador-ocr-helper) and the documentation (chgov-brprotokolle-markdown). The frontend is written in React and the frontend utility in plain JavaScript.

Architecture and components

The backend server has two major tasks: ingestion and search routing. Search routing provides an interface for queries, which are passed more or less directly to the corresponding SOLR instance. Ingestion, outlined below, stores data in the SOLR instance and creates IIIF representations of the minutes.
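To illustrate what passing a query through to SOLR might look like, here is a minimal sketch. It is not taken from the project's source: the `SearchRequest` shape, the function name, the base URL and the core name are assumptions made for illustration. Only the `q`, `rows`, `start` and `wt` parameters and the `/select` handler are standard SOLR.

```typescript
// Hypothetical shape of a search request accepted by the backend (assumption).
interface SearchRequest {
  query: string;  // search text, forwarded as the SOLR `q` parameter
  rows?: number;  // page size, defaults to 10 in this sketch
  start?: number; // offset of the first result, defaults to 0 in this sketch
}

// Builds the URL of a SOLR select request for the given core.
// `solrBaseUrl` and `core` are placeholders, not values used by the project.
function buildSolrSelectUrl(
  solrBaseUrl: string,
  core: string,
  req: SearchRequest
): string {
  const params: [string, string][] = [
    ["q", req.query],
    ["rows", String(req.rows ?? 10)],
    ["start", String(req.start ?? 0)],
    ["wt", "json"],
  ];
  // Percent-encode every key and value so user input cannot break the URL.
  const queryString = params
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  // Strip trailing slashes from the base URL before appending the path.
  const base = solrBaseUrl.replace(/\/+$/, "");
  return `${base}/${encodeURIComponent(core)}/select?${queryString}`;
}
```

A route handler could call this function with the incoming query and forward the request to the resulting URL. How the real backend maps its routes to SOLR parameters may differ.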

Pipeline Ingestion

The ingestion pipeline handles either handwritten minutes (e.g. with OCR provided by the Transkribus project) or machine-written minutes (e.g. as PDF files, without OCR). It enriches the minutes with the provided metadata and ultimately stores the relevant information in a SOLR instance. To start the ingestion, files in the appropriate format have to be added to the HOTFOLDER, where the dirWatcher picks them up. What happens next depends on the type of minutes: handwritten minutes are handled by the collectionBuilder for further processing. Machine-written minutes are ingested as single PDFs, so before further processing the images have to be extracted (imgExtractor) and the OCR is then extracted from those images (ocrExtractor). At this point the images, OCR data and metadata are available, and machine-written and handwritten minutes are no longer treated differently. The OCR data is compiled into a single text file, the OCR plaintext, which is stored together with the images and the OCR data under the DATAFOLDER directory. The metadata and the known locations of the images, OCR data and OCR plaintext are used to generate the IIIF manifests (manifestCreate). These manifests are delivered by an external web server that is not part of the backend project. The final step, solrAdd, completes the ingestion by adding the relevant information to the SOLR instance.
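The branching described above can be summarised in a short sketch. The step names (dirWatcher, collectionBuilder, imgExtractor, ocrExtractor, manifestCreate, solrAdd) come from the description. The types, the `planIngestion` function and the exact ordering are one reading of that description and are assumptions, not the project's actual implementation.

```typescript
// The two kinds of minutes the pipeline distinguishes (type name is hypothetical).
type MinutesKind = "handwritten" | "machineWritten";

// Step names as mentioned in the description of the pipeline.
type Step =
  | "dirWatcher"
  | "collectionBuilder"
  | "imgExtractor"
  | "ocrExtractor"
  | "manifestCreate"
  | "solrAdd";

// Returns the ordered list of steps for one kind of minutes.
// Assumption: the type-specific steps sit between dirWatcher and manifestCreate.
function planIngestion(kind: MinutesKind): Step[] {
  const typeSpecific: Step[] =
    kind === "handwritten"
      ? ["collectionBuilder"] // OCR is already provided, e.g. by Transkribus
      : ["imgExtractor", "ocrExtractor"]; // PDF: extract images, then OCR
  // From here on both kinds share the same steps.
  return ["dirWatcher", ...typeSpecific, "manifestCreate", "solrAdd"];
}
```

For example, `planIngestion("machineWritten")` yields dirWatcher, imgExtractor, ocrExtractor, manifestCreate and solrAdd, in that order.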

First steps

Preparations

The backend server requires a running SOLR instance, prepared with the appropriate schema and plugin.

Install

Because this is a Node.js project, the development environment is installed by running npm install.

Customization

General

Custom elements for the pipeline can be added as described in the archival-iiif-server documentation.

Run tests

There aren't any automated tests available. End-to-end runs have to be checked manually.

Authors

License

GNU Affero General Public License (AGPLv3), see LICENSE

Contribute

This repository is a copy that is updated regularly; contributions via pull requests are therefore not possible. However, independent copies (forks) are possible under consideration of the MIT license.

Contact

For general questions (and technical support), please contact the Swiss Federal Archives by e-mail at [email protected].
Technical questions or problems concerning the source code can be posted here on GitHub via the "Issues" interface.
