Let's start with some factors to consider when picking the right search engine for our use case (a toy scoring sketch follows the list).
- The size of each row / item
- The size of the entire dataset
- Whether the dataset is fixed or continuously growing
- Whether each row / item is immutable or mutable
- Whether each row / item has a consistent schema / structure
- Whether a search-as-you-type experience is needed
- The amount of parsing, validation, cleaning, and transformation needed
- API & documentation
- Community & consultation (for support)
- Ease of setup / installation
- Ease of creating / updating / deleting data
- Operating costs in CPU, RAM, disk, and network bandwidth
- Ease of scaling up (vertically) by adding more CPU, RAM, or disk
- Ease of scaling out (horizontally) by adding more server instances
- Search speed
- Search accuracy
- Search sorting and filtering
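To make these trade-offs concrete, here's a toy weighted-scoring sketch in Python. The criteria, weights, and ratings are made-up placeholders, not measurements; plug in your own numbers after evaluating each engine.

```python
# Toy decision helper: rank candidate engines by a weighted score over
# the criteria above. All weights and ratings below are hypothetical.
CRITERIA_WEIGHTS = {
    "search_speed": 3,
    "ease_of_setup": 2,
    "horizontal_scaling": 2,
    "operating_cost": 1,
}

# 1-5 ratings per engine, per criterion (placeholder numbers).
RATINGS = {
    "engine_a": {"search_speed": 5, "ease_of_setup": 3, "horizontal_scaling": 4, "operating_cost": 2},
    "engine_b": {"search_speed": 4, "ease_of_setup": 5, "horizontal_scaling": 3, "operating_cost": 4},
}

def weighted_score(scores: dict[str, int]) -> int:
    # Sum of (weight * rating) over every criterion.
    return sum(CRITERIA_WEIGHTS[c] * scores[c] for c in CRITERIA_WEIGHTS)

# Print engines from best to worst overall score.
for engine, scores in sorted(RATINGS.items(), key=lambda kv: -weighted_score(kv[1])):
    print(f"{engine}: {weighted_score(scores)}")
```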
I've chosen not to compare them feature by feature because they improve quite frequently, but here are some highlights, links, and examples for each.
- Manticore Search
  - Highlights: disk-based with a RAM cache; scales well.
  - Example: judyrecords.com
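As a taste of its query interface, here's a minimal Python sketch against Manticore's Elasticsearch-like HTTP JSON API, assuming a local instance on the default port 9308 and a hypothetical index named `articles`:

```python
# A sketch only: query a local Manticore Search instance through its
# Elasticsearch-like JSON API. The "articles" index is hypothetical.
import requests

resp = requests.post(
    "http://localhost:9308/search",  # default HTTP API port
    json={
        "index": "articles",
        "query": {"match": {"*": "court records"}},  # match across all fields
        "limit": 5,
    },
)
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    print(hit["_id"], hit["_source"])
```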
- Typesense
  - Highlights: RAM-based; scales well.
  - Example: typesense.org/#showcase
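Here's a minimal Python sketch of a Typesense search over its HTTP API, assuming a local instance on the default port 8108; the `companies` collection, `company_name` field, and `xyz` API key are placeholders echoing Typesense's getting-started docs:

```python
# A sketch only: search a local Typesense instance. The collection,
# field, and API key mirror Typesense's getting-started examples.
import requests

resp = requests.get(
    "http://localhost:8108/collections/companies/documents/search",
    headers={"X-TYPESENSE-API-KEY": "xyz"},  # placeholder key
    params={"q": "stark", "query_by": "company_name"},
)
resp.raise_for_status()
for hit in resp.json()["hits"]:
    print(hit["document"])
```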
- Quickwit
  - Highlights: stores index data on S3-compatible object storage; designed for immutable datasets; scales well, I guess.
  - Example: common-crawl.quickwit.io
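Here's a minimal Python sketch against Quickwit's REST search API, assuming a local instance on the default port 7280 and a hypothetical index named `commoncrawl`:

```python
# A sketch only: search a local Quickwit instance over its REST API.
# The index id "commoncrawl" and field "body" are hypothetical.
import requests

resp = requests.get(
    "http://localhost:7280/api/v1/commoncrawl/search",
    params={"query": "body:search", "max_hits": 5},
)
resp.raise_for_status()
for hit in resp.json()["hits"]:
    print(hit)
```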
- Meilisearch
  - Highlights: disk-based with memory-mapped files.
  - Example: docs.meilisearch.com/learn/what_is_meilisea..
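Here's a minimal Python sketch of a Meilisearch search, assuming a local instance on the default port 7700 and a hypothetical `movies` index (a secured instance would also need an Authorization header):

```python
# A sketch only: search a local Meilisearch instance. The "movies"
# index is hypothetical; note the built-in typo tolerance.
import requests

resp = requests.post(
    "http://localhost:7700/indexes/movies/search",
    json={"q": "botman", "limit": 5},  # still matches "batman"
)
resp.raise_for_status()
for hit in resp.json()["hits"]:
    print(hit)
```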
- Sonic
  - Highlights: used by the crisp.chat helpdesk.
  - Example: github.com/valeriansaliou/sonic#demo
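Unlike the others, Sonic speaks a raw TCP text protocol rather than HTTP. Here's a minimal Python sketch of a search query, assuming a local instance on the default port 1491 with Sonic's default `SecretPassword`; the `messages` collection and `default` bucket are hypothetical, and a real app would use a client library instead of hand-rolling the protocol:

```python
# A sketch only: query a local Sonic instance over its TCP protocol.
import socket

def recv_line(sock: socket.socket) -> str:
    # Read one CRLF-terminated protocol line, byte by byte (fine for a demo).
    data = b""
    while not data.endswith(b"\r\n"):
        chunk = sock.recv(1)
        if not chunk:
            break
        data += chunk
    return data.decode().strip()

with socket.create_connection(("localhost", 1491)) as sock:
    print(recv_line(sock))                          # CONNECTED <sonic-server ...>
    sock.sendall(b"START search SecretPassword\r\n")
    print(recv_line(sock))                          # STARTED search ...
    sock.sendall(b'QUERY messages default "spam"\r\n')
    print(recv_line(sock))                          # PENDING <marker>
    print(recv_line(sock))                          # EVENT QUERY <marker> <object ids>
```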