Backend module configuration parameters

There are three ways to configure the backend module: pass the parameters in code via the main function, provide them as command-line arguments, or set environment variables.

Note

Command-line arguments take precedence over parameters passed to the main function, which in turn take precedence over environment variables!
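The precedence rule in the note can be illustrated with a small stand-alone sketch. This is not DASF code: resolve_setting and its arguments are hypothetical names chosen for illustration, and only the DE_BACKEND_ prefix is taken from the environment variable section of this page.

```python
import os

_UNSET = object()  # sentinel meaning "this source did not provide a value"


def resolve_setting(name, cli_value=_UNSET, main_value=_UNSET, env=None, default=None):
    """Return the first value found, checking the sources in precedence order."""
    env = os.environ if env is None else env
    if cli_value is not _UNSET:       # 1. command-line argument
        return cli_value
    if main_value is not _UNSET:      # 2. parameter passed to main()
        return main_value
    env_key = "DE_BACKEND_" + name.upper()
    if env_key in env:                # 3. environment variable
        return env[env_key]
    return default                    # nothing was set anywhere


# A plain dict stands in for the process environment in this example.
fake_env = {"DE_BACKEND_HOST": "env.example.org"}
print(resolve_setting("host", env=fake_env, default="localhost"))
print(resolve_setting("host", main_value="main.example.org", env=fake_env))
print(resolve_setting("host", cli_value="cli.example.org",
                      main_value="main.example.org", env=fake_env))
```

Each call adds one higher-priority source, so the three printed values come from the environment, the main function parameter and the command line, in that order.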

Using the main function

When calling the main() function, you can pass various configuration options (see the API reference).

Providing command line arguments

All parameters can also be provided via command line arguments.

usage: python my_module.py [-h] -t MESSAGING_CONFIG.TOPIC
                           [--header {"authorization": "Token ..."}]
                           [-m MODULE_NAME] [-d DOC] [--debug]
                           [-H PULSAR_CONFIG.HOST] [-p PULSAR_CONFIG.PORT]
                           [--persistent PULSAR_CONFIG.PERSISTENT]
                           [--tenant PULSAR_CONFIG.TENANT]
                           [--namespace PULSAR_CONFIG.NAMESPACE]
                           [--max-workers MESSAGING_CONFIG.MAX_WORKERS]
                           [--queue_size MESSAGING_CONFIG.QUEUE_SIZE]
                           [--max_payload_size MESSAGING_CONFIG.MAX_PAYLOAD_SIZE]
                           [--producer-keep-alive MESSAGING_CONFIG.PRODUCER_KEEP_ALIVE]
                           [--producer-connection-timeout MESSAGING_CONFIG.PRODUCER_CONNECTION_TIMEOUT]
                           [--websocket-url WEBSOCKETURL_CONFIG.WEBSOCKET_URL]
                           [--producer-url WEBSOCKETURL_CONFIG.PRODUCER_URL]
                           [--consumer-url WEBSOCKETURL_CONFIG.CONSUMER_URL]
                           [--members member [member ...]]
                           [--log-config LOG_CONFIG.CONFIG_FILE]
                           [--log-level {10,20,30,40,50}]
                           [--log-file LOG_CONFIG.LOGFILE]
                           [--merge-log-config]
                           [--log-overrides LOG_CONFIG.CONFIG_OVERRIDES]
                           {test-connect,listen,schema,send-request,compute,shell,generate}
                           ...

Named Arguments

-m, --module

Name of the backend module.

Default: '__main__'

-d, --description

The documentation of the object. If empty, this will be taken from the corresponding __doc__ attribute.

Default: ''

--debug

Run the backend module in debug mode (creates more verbose error messages).

Default: False

--members

List of members for this module.

Default: []

Connection options

General options for the backend module

-t, --topic

The topic identifier under which to register at the pulsar. This option is required!

Default: '__NOTSET'

--header

Header parameters for the request.

Default: {}

--max-workers

(optional) Number of concurrent workers for handling requests. Default: the number of processors on the machine, multiplied by 5.

--queue_size

(optional) Size of the request queue. If MAX_WORKERS is set, this needs to be at least as big as MAX_WORKERS; otherwise an AttributeException is raised.
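The two rules for --max-workers and --queue_size can be sketched as follows. The function names are hypothetical, and ValueError is used here only as a stand-in for the exception that the backend module raises.

```python
import os


def default_max_workers():
    """Number of processors on the machine, multiplied by 5."""
    # os.cpu_count() may return None, so fall back to a single processor.
    return (os.cpu_count() or 1) * 5


def check_queue_size(queue_size, max_workers=None):
    """Reject a queue that is smaller than an explicitly set worker count."""
    if max_workers is not None and queue_size < max_workers:
        raise ValueError(
            f"queue_size ({queue_size}) must be at least max_workers ({max_workers})"
        )
    return queue_size


print("default number of workers on this machine:", default_max_workers())
print("accepted queue size:", check_queue_size(10, max_workers=4))
```

Calling check_queue_size(2, max_workers=4) raises the error, which mirrors the constraint described above.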

--max_payload_size

(optional) Maximum payload size. This must be smaller than Pulsar's ‘webSocketMaxTextFrameSize’, which is configured e.g. via ‘pulsar/conf/standalone.conf’. The default of 512000 corresponds to 500 KiB.

Default: 512000
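As a quick check of the numbers, 512000 is 500 * 1024. The sketch below compares a message against that limit. It assumes that the size is measured in UTF-8 encoded bytes, which this page does not state, and fits_in_payload is a hypothetical helper.

```python
MAX_PAYLOAD_SIZE = 500 * 1024  # 512000, the documented default


def fits_in_payload(message, max_payload_size=MAX_PAYLOAD_SIZE):
    """Return True if the UTF-8 encoded message is within the size limit."""
    return len(message.encode("utf-8")) <= max_payload_size


print(MAX_PAYLOAD_SIZE)
print(fits_in_payload("a" * 512000))
print(fits_in_payload("a" * 512001))
```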

--producer-keep-alive

The amount of time, in seconds, that the websocket connection to a producer should be kept open. By default, 2 minutes (120 seconds). On each outgoing message, the timer will be reset. Set this to 0 to immediately close the connection when a message has been sent and acknowledged.

Default: 120

--producer-connection-timeout

The amount of time that we grant producers to establish a connection to the message broker in order to send a response. If a connection cannot be established in this time, the response will not be sent and the connection will be closed.

Default: 30

Pulsar connection options

Arguments for connecting to a pulsar. This is the default connection method unless you specify a websocket-url (see below).

-H, --host

The remote host of the pulsar.

Default: 'localhost'

-p, --port

The port of the pulsar at the given host.

Default: '8080'

--persistent

Default: 'non-persistent'

--tenant

Default: 'public'

--namespace

Default: 'default'
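To show how the options in this group relate to each other, here is a sketch that assembles them into the endpoint layout documented for Apache Pulsar's WebSocket API (/ws/v2/producer/... and /ws/v2/consumer/.../subscription). Whether the backend module builds its URLs in exactly this way is an assumption, and pulsar_ws_url as well as the subscription name are hypothetical.

```python
def pulsar_ws_url(kind, topic, host="localhost", port="8080",
                  persistent="non-persistent", tenant="public",
                  namespace="default", subscription=None):
    """Build a Pulsar WebSocket URL from the connection options."""
    if kind not in ("producer", "consumer"):
        raise ValueError("kind must be 'producer' or 'consumer'")
    url = f"ws://{host}:{port}/ws/v2/{kind}/{persistent}/{tenant}/{namespace}/{topic}"
    if kind == "consumer":
        # Consumer endpoints additionally carry a subscription name.
        if not subscription:
            raise ValueError("a consumer URL needs a subscription name")
        url += "/" + subscription
    return url


print(pulsar_ws_url("producer", "my-topic"))
print(pulsar_ws_url("consumer", "my-topic", subscription="my-subscription"))
```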

Websocket URL group

Arguments for connecting to an arbitrary websocket service.

--websocket-url

The fully qualified URL to the websocket.

Default: ''

--producer-url

An alternative URL to use for producers. If not set, the websocket_url will be used.

Default: ''

--consumer-url

An alternative URL to use for consumers. If not set, the websocket_url will be used.

Default: ''
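The fallback between the three URL options can be sketched like this. effective_urls is a hypothetical helper, and the URLs are placeholders.

```python
def effective_urls(websocket_url="", producer_url="", consumer_url=""):
    """Pick the URL for each role, falling back to the general websocket URL."""
    return {
        "producer": producer_url or websocket_url,
        "consumer": consumer_url or websocket_url,
    }


# Only the producer gets its own URL; the consumer falls back to websocket_url.
print(effective_urls(
    websocket_url="ws://example.org/ws",
    producer_url="ws://example.org/ws/out",
))
```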

Logging Configuration group

Arguments for configuring the logging within DASF.

--log-config

Path to the logging configuration.

Default: /home/docs/checkouts/readthedocs.org/user_builds/dasf/checkouts/latest/demessaging/config/logging.yaml

--log-level

Possible choices: 10, 20, 30, 40, 50

Level for the logger. Setting this will override any levels specified in the logging config file. The lower the value, the more verbose the logging. Typical levels are 10 (DEBUG), 20 (INFO), 30 (WARNING), 40 (ERROR) and 50 (CRITICAL).
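The numeric choices are the standard levels of Python's logging module. The following snippet uses only the standard library to print their names and to show that a lower value lets more messages through.

```python
import logging

# Map the numeric choices to their standard level names.
for level in (10, 20, 30, 40, 50):
    print(level, logging.getLevelName(level))

logger = logging.getLogger("example")
logger.setLevel(30)  # WARNING: anything below 30 is dropped
print(logger.isEnabledFor(logging.INFO))
print(logger.isEnabledFor(logging.ERROR))
```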

--log-file

A path to use for logging. If this is specified, we will add a RotatingFileHandler that logs to the given path and add this handler to any logger in the logging config.
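What this option describes can be approximated with the standard library as shown below. This is a sketch only: the rotation limits, the log format and the logger name are assumptions and are not taken from the DASF logging configuration.

```python
import logging
import logging.handlers
import os
import tempfile


def add_rotating_file_handler(logfile, logger_names,
                              max_bytes=1_000_000, backup_count=3):
    """Attach one RotatingFileHandler to each of the named loggers."""
    handler = logging.handlers.RotatingFileHandler(
        logfile, maxBytes=max_bytes, backupCount=backup_count
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    for name in logger_names:
        logging.getLogger(name).addHandler(handler)
    return handler


# Write one record to a log file in a temporary directory.
logfile = os.path.join(tempfile.mkdtemp(), "backend.log")
handler = add_rotating_file_handler(logfile, ["example.backend"])
log = logging.getLogger("example.backend")
log.setLevel(logging.INFO)
log.info("written to the rotating log file")
handler.flush()
print(logfile)
```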

--merge-log-config

If this is True, the specified logging configuration file will be merged with the default one at /home/docs/checkouts/readthedocs.org/user_builds/dasf/checkouts/latest/demessaging/config/logging.yaml

Default: False

--log-overrides

Any valid YAML string, YAML file or JSON file that is merged into the logging configuration. This option can be used to quickly override some default logging functionality.
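Merging overrides into a logging configuration can be pictured as a recursive dictionary update. The sketch below assumes a deep merge, which this page does not confirm, and parses the override with the standard json module to stay free of extra dependencies. deep_merge and the sample configuration are hypothetical.

```python
import json


def deep_merge(base, overrides):
    """Return a new dict in which overrides are merged recursively into base."""
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


base_config = {
    "version": 1,
    "root": {"level": "INFO", "handlers": ["console"]},
}
# Only the root level is overridden; the handlers are kept.
overrides = json.loads('{"root": {"level": "DEBUG"}}')
merged = deep_merge(base_config, overrides)
print(merged)
```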

Commands

command

Possible choices: test-connect, listen, schema, send-request, compute, shell, generate

Sub-commands

test-connect

Connect the backend module to the pulsar message handler.

python my_module.py test-connect [-h]

listen

Connect the backend module to the pulsar message handler.

python my_module.py listen [-h] [--dump-to LISTEN_CONFIG.DUMP_TO]
                           [--dump-tool LISTEN_CONFIG.DUMP_TOOL]
                           [-c LISTEN_CONFIG.CMD]

Named Arguments

--dump-to

Instead of processing the request, dump it as a file to the given location. If you need further customization, use --dump-tool.

--dump-tool

Instead of using --dump-to, use this option to run a specific command for each request. We will first create a temporary file and then run this command as a subprocess. This parameter requires --dump-to and two curly brackets ({}) in the argument that specify where to insert the target path. Alternatively, use {path}, {basename} or {directory} for more explicit control in your command.

If you want to process the dumped file further, combine this option with --cmd

Examples

Copy the request to a given location via rsync:

--dump-tool 'rsync {} .'

Copy the request via SSH to another server:

--dump-tool 'scp {} user@machine:/some/folder/'

Print the request to stdout and delete the temporary file:

--dump-tool 'cat {path} && rm {path}'

-c, --cmd

Instead of processing the request here, dump the request as a file to the disk and run the dedicated command. The specified command must contain two curly braces ({}) that will be replaced with the path or basename of the file. Alternatively, use {path}, {basename} or {directory} for more explicit control in your command.

Examples

Cat the request (i.e. always return the input to the sender):

--cmd 'cat {}'

Copy the file via scp and run some command to process it on a remote machine:

--dump-tool 'scp {} user@machine:/some/folder/' --cmd 'some-command /some/folder/{basename}'
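The placeholders used by --dump-tool and --cmd behave much like Python format fields. The following sketch shows one way such a template could be expanded. It is an assumption about the mechanism and not the backend module's implementation; expand_command is a hypothetical helper, and the bare {} is assumed to stand for the full path.

```python
import os


def expand_command(template, path):
    """Fill {}, {path}, {basename} and {directory} in a command template."""
    return template.format(
        path,                              # bare {} is filled with the full path
        path=path,
        basename=os.path.basename(path),
        directory=os.path.dirname(path),
    )


request_file = "/tmp/requests/req.json"
print(expand_command("rsync {} .", request_file))
print(expand_command("cat {path} && rm {path}", request_file))
print(expand_command("some-command /some/folder/{basename}", request_file))
```

A real implementation would also need to quote the path before handing the command to a shell, for example with shlex.quote.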

schema

Print the schema for the backend module.

python my_module.py schema [-h] [-i INDENT]

Named Arguments

-i, --indent

Indent the JSON dump.

send-request

Test a request via the pulsar messaging system.

python my_module.py send-request [-h] request

Positional Arguments

request

A JSON-formatted file with the request.

compute

Process a JSON-formatted file as request.

python my_module.py compute [-h] request

Positional Arguments

request

A JSON-formatted file with the request.

shell

This command starts an IPython shell where you can access and work with the generated pydantic Model. This model class is available via the Model variable in the shell.

python my_module.py shell [-h]

generate

This command generates an API module that connects to the backend module via the pulsar and can be used on the client side. We use isort and black to format the generated Python file.

python my_module.py generate [-h] [-l LINE_LENGTH] [--no-formatters]
                             [--no-isort] [--no-black] [--no-autoflake]

Named Arguments

-l, --line-length

The line-length for the output API.

Default: 79

--no-formatters

Do not use any formatters (isort, black or autoflake) for the generated code.

Default: True

--no-isort

Do not use isort for formatting.

Default: True

--no-black

Do not use black for formatting.

Default: True

--no-autoflake

Do not use autoflake for formatting.

Default: True

Using environment variables

Parameters for the messaging config (i.e. for PulsarConfig and WebsocketURLConfig) can also be provided via environment variables. Add the following prefix to the exported parameter: DE_BACKEND_, e.g. DE_BACKEND_HOST to set the host parameter.

The same works for the logging setup. The LoggingConfig class listens to environment variables prefixed with DE_LOGGING_ (e.g. DE_LOGGING_LEVEL).
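The prefix convention can be illustrated with a small sketch that collects all variables for one prefix and strips it again. collect_prefixed is a hypothetical helper and not part of DASF, and a plain dict stands in for the process environment.

```python
import os


def collect_prefixed(prefix, env=None):
    """Return {parameter: value} for all variables that start with prefix."""
    env = os.environ if env is None else env
    return {
        key[len(prefix):].lower(): value
        for key, value in env.items()
        if key.startswith(prefix)
    }


example_env = {
    "DE_BACKEND_HOST": "pulsar.example.org",
    "DE_BACKEND_PORT": "8080",
    "DE_LOGGING_LEVEL": "10",
    "HOME": "/home/user",
}
print(collect_prefixed("DE_BACKEND_", example_env))
print(collect_prefixed("DE_LOGGING_", example_env))
```

In a shell, the equivalent would be to export DE_BACKEND_HOST before starting the backend module.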