Database-Server
The Database-Server package provides the server-side implementation of Database-API. It is a collection of wrapped retrieval methods.
database-server Package
The database-server package contains only a placeholder database_app Flask application at the moment.
Deploying
Use the wheel:
pip install database-server --index-url https://gitlab.fachschaften.org/api/v4/projects/2724/packages/pypi/simple
Or install the package directly:
python3 -m pip install .
This currently installs two executables in the default Python bin directory:
- database-server is the server application
- database-server-opensearch is for direct OpenSearch interfacing
Project setup
Services
All services used by the database-server are defined in the docker-compose.yml file.
Make sure to edit .env to contain your HuggingFace API token, since the jinaai models require authorization.
You can start the services all together with:
cd database-server-services
docker compose --profile cpu up -d # or --profile cuda
NOTE: This will load some large models via TEI into memory.
To only start selected services, run something like:
cd database-server-services
docker compose up tei-jinaai-jina-embedding-v2-small-en-cpu opensearch-node1 -d
Testing the API manually
Start the server by running:
poetry run python database_server/database_app.py
Use a tool like Postman to send requests without much effort.
Send requests to http://localhost:8080/retrieval with content type application/json.
Example Body for the Generic retriever:
{
  "query": "My Query",
  "retrieverType": {
    "name": "Generic"
  },
  "maxLength": 50
}
Example Body for the OpenSearchTerm retriever (requires the indices index1 and index2 to exist):
{
  "query": "Query",
  "retrieverType": {
    "name": "OpenSearchTerm",
    "args": {
      "operator": "or",
      "indices": [
        "index1",
        "index2"
      ],
      "minimum_should_match": 1
    }
  },
  "maxLength": 50
}
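If you prefer a script over Postman, the two example bodies above can also be sent from Python. The following is a minimal sketch using only the standard library. The helper names build_retrieval_request and send are hypothetical and not part of the package, and the sketch assumes the server answers with JSON, which the README does not state.

```python
import json
import urllib.request

# Endpoint as given above; adjust host and port if your server differs.
RETRIEVAL_URL = "http://localhost:8080/retrieval"


def build_retrieval_request(query, retriever_name, args=None, max_length=50,
                            url=RETRIEVAL_URL):
    """Build a POST request for the retrieval endpoint (hypothetical helper)."""
    retriever_type = {"name": retriever_name}
    if args is not None:
        # Only retrievers such as OpenSearchTerm carry an "args" object.
        retriever_type["args"] = args
    payload = {
        "query": query,
        "retrieverType": retriever_type,
        "maxLength": max_length,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10):
    """Send the request; assumes (unverified) that the response body is JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, with the server running locally:
# print(send(build_retrieval_request("My Query", "Generic")))
# print(send(build_retrieval_request(
#     "Query", "OpenSearchTerm",
#     args={"operator": "or", "indices": ["index1", "index2"],
#           "minimum_should_match": 1})))
```

Building the request separately from sending it makes it easy to inspect the exact JSON body before anything goes over the wire.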
Dependencies
The dependencies can be managed with the lean specification in requirements.txt.
Optionally, poetry can be used to pin explicit torch+cpu versions (this seems to be a workaround for non-cuda systems).
.venv
Set up and activate a venv:
python -m venv .venv
source .venv/bin/activate
Install the project/dependencies:
pip install -r requirements.txt
pip install -r requirements-dev.txt # optional
pip install -e . # Setup project for local usage
This will prompt you to enter your GitLab credentials. If you use 2FA, generate a Personal Access Token. You can store your credentials in your .pypirc file:
[gitlab]
repository = https://gitlab.fachschaften.org/api/v4/projects/2744/packages/pypi
username = __token__
password = <your personal access token>
Poetry
You can simply use the devcontainer or install poetry yourself:
- Official Installer or
- Local pip installation:
pip install poetry
Add credentials for the private Database-API package registry:
poetry config http-basic.database_api <your-user-name> <your-access-token-or-password>
Create the poetry env:
poetry install --with=dev
Create a shell for the environment:
poetry shell
pre-commit
poetry run pre-commit install
poetry run pre-commit install --hook-type commit-msg
Known Issues
Error | Fix |
---|---|
segmentation fault (core dumped) when loading pix2text | Disable pytorch JIT by setting the PYTORCH_JIT=0 environment variable. |
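
If exporting the variable in your shell is inconvenient, it can also be set from Python. This is a minimal sketch, assuming you control the entry point that first imports pix2text, and relying on PyTorch reading PYTORCH_JIT when torch is first imported, so the assignment has to come before that import.

```python
import os

# Must run before torch (or pix2text, which imports torch) is first imported;
# setting it afterwards is assumed to have no effect on the already loaded module.
os.environ["PYTORCH_JIT"] = "0"

# import pix2text  # import only after the variable has been set
```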