Recommendation: three-tier architecture #41

jar398 · 2016-12-13T20:42:50Z

Writeup as requested by @kcranston

In designing a possible integration of phylesystem-api and otindex, I recommend following the multitier idea (https://en.wikipedia.org/wiki/Multitier_architecture), other things being equal. I used to think that this was just computer industry BS, but have come to see the logic behind it.

"a client–server architecture in which presentation, application processing, and data management functions are physically separated"

For Open Tree, the data management functions are:

1. the github repo clone access and update functions that are managed by peyotl
2. the supplementary uploaded file set currently managed by the webapp

Application processing includes all of our cache-like databases and the web services on top of them: OTI (or otindex), taxomachine, treemachine, parts of phylesystem-api (?), conflict service. Note that as currently imagined otindex does not do data management; it is just a cache.

Presentation of course is the webapp.

Data management is characterized by being centralized and uncached. (Of course there is such a thing as a distributed database, but we are nowhere close to making use of such a thing.) It is responsible for updating the data, not just reading it. The physical instantiation of the data management tier is unreplicated. It wants to be as lean as possible because it is going to be hit a lot and there are limited opportunities for making it faster.

The application processing (API) can be replicated, since it is not in the business of taking care of the 'truth' of the data (update, consistency, and so on). It has caches of the data - but that's a completely different story. Caching is not data management, it is data use (application processing).
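To make the split concrete, here is a minimal Python sketch. The names `DataManager` and `AppServer` are hypothetical and are not part of peyotl, phylesystem-api, or otindex; the sketch only assumes one authoritative store with any number of cache-holding application servers in front of it. Cache invalidation is deliberately left out.

```python
class DataManager:
    """Hypothetical data management tier: the single, unreplicated source of truth."""

    def __init__(self):
        self._studies = {}  # study_id -> content
        self._version = 0   # bumped on every update

    def read(self, study_id):
        return self._studies.get(study_id), self._version

    def write(self, study_id, content):
        self._studies[study_id] = content
        self._version += 1
        return self._version


class AppServer:
    """Hypothetical application processing tier: holds a cache, owns no data."""

    def __init__(self, data_manager):
        self._dm = data_manager
        self._cache = {}  # study_id -> (content, version seen at read time)

    def get(self, study_id):
        # Serve from the local cache when possible; otherwise ask the data tier.
        if study_id not in self._cache:
            self._cache[study_id] = self._dm.read(study_id)
        return self._cache[study_id][0]

    def update(self, study_id, content):
        # Updates are forwarded to the data tier; the local copy is only a cache.
        version = self._dm.write(study_id, content)
        self._cache[study_id] = (content, version)
        return version


dm = DataManager()
replicas = [AppServer(dm), AppServer(dm)]  # application processing can be replicated
replicas[0].update("study_1", "tree data")
print(replicas[1].get("study_1"))
```

Both replicas answer reads, but only `DataManager` ever holds the authoritative copy, which is the sense in which caching counts as data use rather than data management.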

The key phrase here is "physically separated". 3-tier does not imply that the code be in separate repos. It just says the three functions should be physically separate once deployed.

The advantages are not just in performance (replication) but in robustness - you can crash, reboot, test etc. application processing servers without threatening the data management server (and therefore the data itself).
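The robustness argument can be sketched the same way. This is an illustration under stated assumptions, not a description of any existing Open Tree service: `TruthStore` and `CacheServer` are made-up names, and a "crash" is modeled simply as losing in-memory state.

```python
class TruthStore:
    """Hypothetical data management server; the only place the data lives."""

    def __init__(self):
        self.studies = {"study_1": "tree data"}


class CacheServer:
    """Hypothetical application processing server; everything it holds is derived."""

    def __init__(self, store):
        self.store = store
        self.cache = {}

    def warm(self):
        # Rebuild the cache from the data management tier.
        self.cache = dict(self.store.studies)

    def crash(self):
        # Simulate a crash or reboot: all local state is lost.
        self.cache = {}


store = TruthStore()
server = CacheServer(store)
server.warm()
server.crash()              # the application server goes down...
survived = dict(store.studies)  # ...but the data tier is untouched
server.warm()               # and the cache can be rebuilt from it
print(survived == server.cache)
```

Because the application server holds nothing that cannot be regenerated, crashing, rebooting, or testing it puts no data at risk.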

I think it would be nice if 1 and 2 were eventually on the same server, although our file upload set is write-once, so it is considerably less sensitive than the phylesystem. (I'll make a tandem opentree issue.)

This is just a recommendation. If we decide not to do it, nothing much will change; all we do is make it harder, down the road, to replicate the application processing logic, should we ever want to or need to. That is, by intermixing data management and application processing in new code, we miss an opportunity to clean up the architecture, and we run up technical debt.

I'm not saying replication should be implemented. I'm just saying that it might be wise to avoid design decisions in new work that make 3-tier harder in the future.

jar398 mentioned this issue Dec 13, 2016

File uploads belong in the data management tier, not the presentation tier OpenTreeOfLife/opentree#1113

Open
