Looking for best, future-proof, dictionary format to store all my static glossaries in (i.e. ones no longer changing or being updated) #538
Replies: 4 comments 1 reply
-
I think ZIM files are the best choice. You can create them with the zimwriterfs tool from https://github.com/openzim/zim-tools. I hope GD-ng will use libzim in the future. :)
-
Interesting, I have no experience with the ZIM format. Will look into it.
-
@michaelbeijer You can download Wiktionary/Wikipedia in any language in .zim format for GoldenDict: https://wiki.kiwix.org/wiki/Content_in_all_languages Another tip: you can find many dictionaries on the FreeMDict Forum. For example: https://forum.freemdict.com/t/topic/12050
-
Thanks everyone! Although the ZIM format looks interesting, I'm afraid it's a bit too complicated for me, since there don't seem to be any Windows-based GUI tools for it. I'm currently wondering about the following: with a view to opening GoldenDict and having it scan a huuuuuuge folder of dictionaries, what is better: ? Most of the time and CPU cycles seem to be spent on ‘Indexing for full-text search’. However, this only needs to be done once for the whole collection, and then updated only when individual dictionaries are added. I am currently testing the nonwill, xiaoyifang and official versions to see how long the initial indexing takes and how much CPU/memory is used. I'm using the latest daily Xapian version of the xiaoyifang fork, a version of the official GD found on a Russian forum, and nonwill's "GoldenDict++OCR-3E18-Qt-5.9.9-p3-msvc-16.11.25-Windows-x64-20230321".
-
So, I have a bit of a general question about how to set up GoldenDict-ng for an exceptionally large collection of dictionaries. I aim to manage/search hundreds, if not thousands, of dictionaries over time in GoldenDict-ng.
After lots of research and testing, I am fairly certain I will use the .dsl format to keep track of my own actively changing dictionary, which I add to while translating. I like that it is a simple text format and I can work on it in my text editor. However, I am wondering what kind of performance I will get once I start to work with several hundred DSL dictionaries in the program. The initial scan, or a rescan after moving my dictionaries around, can easily start to take quite a long time. So my idea was to use the DSL format only for the few dictionaries I am actively working on. All the rest, i.e., dictionaries I find online etc., I would store in a different format that GoldenDict-ng can scan more quickly. I also want to be able to do full-text search on all my dictionaries. So my question is: what would be a better format for storing the bulk of these unchanging dictionaries?
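To illustrate what I mean by "a simple text format", here is a minimal Python sketch that builds the text of a small DSL dictionary from a mapping of headwords to definitions. The function name `build_dsl` and the sample entries are hypothetical, and the layout (three `#` header lines, headwords starting at the first column, body lines indented with a tab) is my understanding of the format rather than a full specification. It does no escaping of special DSL characters such as brackets.

```python
def build_dsl(name, index_lang, contents_lang, entries):
    """Return the text of a minimal DSL dictionary (illustrative sketch only).

    `entries` maps each headword to its definition text.
    No escaping of DSL markup characters is attempted here.
    """
    lines = [
        f'#NAME "{name}"',
        f'#INDEX_LANGUAGE "{index_lang}"',
        f'#CONTENTS_LANGUAGE "{contents_lang}"',
        "",
    ]
    for headword, definition in sorted(entries.items()):
        lines.append(headword)  # headword starts at the first column
        for body_line in definition.splitlines():
            lines.append("\t" + body_line)  # body lines are indented
    return "\n".join(lines) + "\n"


# Hypothetical sample data, just to show the shape of the result.
sample = build_dsl("My Glossary", "English", "Dutch", {"cat": "kat", "dog": "hond"})
```

The resulting string can be saved with something like `Path("my_glossary.dsl").write_text(sample, encoding="utf-16")`. UTF-16 is the encoding I have usually seen for DSL files, but that is an assumption worth checking against whatever your GoldenDict build accepts.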
I have basically been studying the very useful list of formats available @ https://github.com/ilius/pyglossary (see: ‘Supported formats’).
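If I do end up converting the static dictionaries in bulk, I imagine it would look something like the sketch below, which only builds the command lines and runs them on request. It assumes PyGlossary is installed and can be invoked as `pyglossary <input> <output>`, which should be checked against the PyGlossary documentation. The function names are hypothetical, and the `.ifo` target extension is just a placeholder for whichever format turns out to be the best choice, not a recommendation.

```python
import subprocess
from pathlib import Path


def conversion_commands(sources, dst_dir, dst_ext=".ifo"):
    """Build one converter command per source dictionary (nothing is executed).

    Assumption: the converter is called as `pyglossary <input> <output>`.
    `dst_ext` is a placeholder target extension.
    """
    dst_dir = Path(dst_dir)
    commands = []
    for src in map(Path, sources):
        # Give every converted dictionary its own subfolder.
        target = dst_dir / src.stem / (src.stem + dst_ext)
        commands.append(["pyglossary", str(src), str(target)])
    return commands


def run_conversions(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            Path(cmd[2]).parent.mkdir(parents=True, exist_ok=True)
            subprocess.run(cmd, check=True)
```

For example, `run_conversions(conversion_commands(Path("dicts").rglob("*.dsl"), "converted"))` would just print the planned commands for every .dsl file under a (hypothetical) `dicts` folder, so the plan can be reviewed before anything is converted.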
It's not clear to me how indexing (for full-text search and search-term lookup) in GoldenDict-ng relates to the fact that some of these formats apparently already contain their own index.
I'm basically looking for the best, future-proof, dictionary format to store all my static glossaries in (i.e. the ones that are no longer changing or being updated), for use in GoldenDict-ng, allowing the fastest possible startup/rescanning times, while still offering optimised search/indexing. If anyone here could explain this I would be very grateful!