API Docs
Client
-
class skein.Client(address=None, security=None, keytab=None, principal=None, log=None, log_level=None, java_options=None)
Connect to and schedule applications on the YARN cluster.
- Parameters
- address : str, optional
The address for the driver. By default, a new driver process is created. Pass in an address explicitly to connect to a different driver. To connect to the global driver, see Client.from_global_driver.
- security : Security, optional
The security configuration to use to communicate with the driver. Defaults to the global configuration.
- keytab : str, optional
Path to a keytab file to use when starting the driver. If not provided, the driver will log in using the ticket cache instead.
- principal : str, optional
The principal to use when starting the driver with a keytab.
- log : str, bool, or None, optional
When starting a new driver, sets the logging behavior for the driver. Values may be a path for logs to be written to, None to log to stdout/stderr, or False to turn off logging completely. Default is None.
- log_level : str or skein.model.LogLevel, optional
The driver log level. Sets the skein.log.level system property. One of {‘ALL’, ‘TRACE’, ‘DEBUG’, ‘INFO’, ‘WARN’, ‘ERROR’, ‘FATAL’, ‘OFF’} (from most to least verbose). Default is ‘INFO’.
- java_options : str or list of str, optional
Additional Java options to forward to the driver. Can also be configured by setting the environment variable SKEIN_DRIVER_JAVA_OPTIONS.
Examples
>>> with skein.Client() as client:
...     app_id = client.submit('spec.yaml')
-
application_logs(app_id, user='')
Get logs from a completed skein application.
- Parameters
- app_id : str
The id of the application.
- user : str, optional
The user to get the application logs as. Requires the current user to have permissions to proxy as user. Default is the current user.
- Returns
- logs : ApplicationLogs
A mapping of yarn_container_id to logs for each container.
Examples
>>> client.application_logs('application_1526134340424_0012')
ApplicationLogs<application_1526134340424_0012>
-
application_report(app_id)
Get a report on the status of a skein application.
- Parameters
- app_id : str
The id of the application.
- Returns
- report : ApplicationReport
Examples
>>> client.application_report('application_1526134340424_0012')
ApplicationReport<name='demo'>
-
close()
Release all resources used by this client. Idempotent.
Note that this closes the Java driver if it was started by this client.
-
connect(app_id, wait=True, security=None)
Connect to a running application.
- Parameters
- app_id : str
The id of the application.
- wait : bool, optional
If True (the default), blocks until the application starts. If False, raises an ApplicationNotRunningError immediately if the application isn’t running.
- security : Security, optional
The security configuration to use to communicate with the application master. Defaults to the global configuration.
- Returns
- app_client : ApplicationClient
- Raises
- ApplicationNotRunningError
If the application isn’t running.
-
classmethod from_global_driver()
Connect to the global driver.
-
get_all_queues()
Get information about all queues in the cluster.
- Returns
- queues : list of Queue
Examples
>>> client.get_all_queues()
[Queue<name='default', percent_used=0.00>, Queue<name='myqueue', percent_used=5.00>, Queue<name='child1', percent_used=10.00>, Queue<name='child2', percent_used=0.00>]
-
get_applications(states=None, name=None, user=None, queue=None, started_begin=None, started_end=None, finished_begin=None, finished_end=None)
Get the status of current skein applications.
- Parameters
- states : sequence of ApplicationState, optional
If provided, applications will be filtered to these application states. Default is ['SUBMITTED', 'ACCEPTED', 'RUNNING'].
- name : str, optional
Only select applications with this name.
- user : str, optional
Only select applications with this user.
- queue : str, optional
Only select applications in this queue.
- started_begin : datetime or str, optional
Only select applications that started after this time (inclusive). Can be either a datetime or a string representation of one. String representations can use any of the following formats:
YYYY-M-D H:M:S (e.g. 2019-4-10 14:50:20)
YYYY-M-D H:M (e.g. 2019-4-10 14:50)
YYYY-M-D (e.g. 2019-4-10)
H:M:S (e.g. 14:50:20, today is used for date)
H:M (e.g. 14:50, today is used for date)
- started_end : datetime or str, optional
Only select applications that started before this time (inclusive). Can be either a datetime or a string representation of one.
- finished_begin : datetime or str, optional
Only select applications that finished after this time (inclusive). Can be either a datetime or a string representation of one.
- finished_end : datetime or str, optional
Only select applications that finished before this time (inclusive). Can be either a datetime or a string representation of one.
- Returns
- reports : list of ApplicationReport
Examples
Get all the finished and failed applications:
>>> client.get_applications(states=['FINISHED', 'FAILED'])
[ApplicationReport<name='demo'>, ApplicationReport<name='dask'>, ApplicationReport<name='demo'>]
Get all applications named ‘demo’ started after 2019-4-10:
>>> client.get_applications(name='demo', started_begin='2019-4-10')
[ApplicationReport<name='demo'>, ApplicationReport<name='demo'>]
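The string formats accepted by the time filters can be illustrated with a small parser. This is only a sketch of the documented formats, not skein's own parsing code; the helper name parse_time_filter and its today argument are hypothetical.

```python
from datetime import date, datetime

# Formats listed in the parameter description, tried in order.
DATETIME_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d %H:%M", "%Y-%m-%d")
TIME_ONLY_FORMATS = ("%H:%M:%S", "%H:%M")


def parse_time_filter(text, today=None):
    """Parse a time-filter string into a datetime (hypothetical helper)."""
    today = today or date.today()
    for fmt in DATETIME_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    for fmt in TIME_ONLY_FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue
        # Time-only strings use today's date.
        return datetime.combine(today, parsed.time())
    raise ValueError("unsupported time format: %r" % text)
```

Passing a datetime object directly avoids any ambiguity about how a string will be interpreted.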
-
get_child_queues(name)
Get information about all children of a parent queue.
- Parameters
- name : str
The parent queue name.
- Returns
- queues : list of Queue
Examples
>>> client.get_child_queues('myqueue')
[Queue<name='child1', percent_used=10.00>, Queue<name='child2', percent_used=0.00>]
-
get_nodes(states=None)
Get the status of nodes in the cluster.
- Parameters
- states : sequence of NodeState, optional
If provided, nodes will be filtered to these node states. Default is all states.
- Returns
- reports : list of NodeReport
Examples
Get all the running nodes:
>>> client.get_nodes(states=['RUNNING'])
[NodeReport<id='worker1.example.com:34721'>, NodeReport<id='worker2.example.com:34721'>]
-
get_queue(name)
Get information about a queue.
- Parameters
- name : str
The queue name.
- Returns
- queue : Queue
Examples
>>> client.get_queue('myqueue')
Queue<name='myqueue', percent_used=5.00>
-
kill_application(app_id, user='')
Kill an application.
- Parameters
- app_id : str
The id of the application to kill.
- user : str, optional
The user to kill the application as. Requires the current user to have permissions to proxy as user. Default is the current user.
-
move_application(app_id, queue)
Move an application to a different queue.
- Parameters
- app_id : str
The id of the application to move.
- queue : str
The queue to move the application to.
-
static start_global_driver(keytab=None, principal=None, log=None, log_level=None, java_options=None)
Start the global driver.
No-op if the global driver is already running.
- Parameters
- keytab : str, optional
Path to a keytab file to use when starting the driver. If not provided, the driver will log in using the ticket cache instead.
- principal : str, optional
The principal to use when starting the driver with a keytab.
- log : str, bool, or None, optional
Sets the logging behavior for the driver. Values may be a path for logs to be written to, None to log to stdout/stderr, or False to turn off logging completely. Default is None.
- log_level : str or skein.model.LogLevel, optional
The driver log level. Sets the skein.log.level system property. One of {‘ALL’, ‘TRACE’, ‘DEBUG’, ‘INFO’, ‘WARN’, ‘ERROR’, ‘FATAL’, ‘OFF’} (from most to least verbose). Default is ‘INFO’.
- java_options : str or list of str, optional
Additional Java options to forward to the driver. Can also be configured by setting the environment variable SKEIN_DRIVER_JAVA_OPTIONS.
- Returns
- address : str
The address of the driver.
-
static stop_global_driver(force=False)
Stops the global driver if running.
No-op if no global driver is running.
- Parameters
- force : bool, optional
By default skein will check that the process associated with the driver PID is actually a skein driver. Setting force to True will kill the process in all cases.
-
submit(spec)
Submit a new skein application.
- Parameters
- spec : ApplicationSpec, str, or dict
A description of the application to run. Can be an ApplicationSpec object, a path to a yaml/json file, or a dictionary description of an application specification.
- Returns
- app_id : str
The id of the submitted application.
-
submit_and_connect(spec)
Submit a new skein application, and wait to connect to it.
If an error occurs before the application connects, the application is killed.
- Parameters
- spec : ApplicationSpec, str, or dict
A description of the application to run. Can be an ApplicationSpec object, a path to a yaml/json file, or a dictionary description of an application specification.
- Returns
- app_client : ApplicationClient
Application Client
-
class skein.ApplicationClient(address, app_id, security=None)
A client for the application master.
Used to interact with a running application.
- Parameters
- address : str
The address of the application master.
- app_id : str
The application id.
- security : Security, optional
The security configuration to use to communicate with the application master. Defaults to the global configuration.
-
add_container(service, env=None)
Add a new container to a service.
Unlike scale, this adds the ability to override configuration of the requested container.
- Parameters
- service : str
The service to add the container to.
- env : dict, optional
A mapping of environment variables to set in the container. These will be applied after any environment variables set in the service description, and can be used as overrides.
- Returns
- container : Container
The new container that was started.
-
close()
Release all resources used by this client. Idempotent.
Note that this only frees resources used by the client; it doesn’t shut down the application itself. For that see shutdown.
-
classmethod from_current()
Create an application client from within a running container.
Useful for connecting to the application master from a running container in an application.
-
get_containers(services=None, states=None)
Get information on containers in this application.
- Parameters
- services : sequence of str, optional
If provided, containers will be filtered to these services. Default is all services.
- states : sequence of ContainerState, optional
If provided, containers will be filtered by these container states. Default is ['WAITING', 'REQUESTED', 'RUNNING'].
- Returns
- containers : list of Container
-
get_specification()
Get the specification for the running application.
- Returns
- spec : ApplicationSpec
-
kill_container(id)
Kill a container.
- Parameters
- id : str
The id of the container to kill.
-
kv
The Skein Key-Value store.
Used by applications to coordinate configuration and global state.
This implements the standard MutableMapping interface, along with the ability to “wait” for keys to be set. Keys are strings, with values as bytes.
Examples
>>> app_client.kv['foo'] = b'bar'
>>> app_client.kv['foo']
b'bar'
>>> del app_client.kv['foo']
>>> 'foo' in app_client.kv
False
Wait until the key is set, either by another service or by a user client. This is useful for inter-service synchronization.
>>> app_client.kv.wait('mykey')
-
scale(service, count=None, delta=None, **kwargs)
Scale a service to a requested number of instances.
Adds or removes containers to match the requested number of instances. The number of instances for the service can be specified either as a total count or a delta in that count.
When choosing which containers to remove, containers are removed in order of state (WAITING, REQUESTED, RUNNING) followed by age (oldest to newest).
When specified as a negative delta, if the number of containers to remove is greater than the number of existing containers, all containers are removed rather than throwing an error. This means that app.scale(delta=-1) will remove a container if one exists, otherwise it will do nothing.
- Parameters
- service : str
The service to scale.
- count : int, optional
The number of instances to scale to.
- delta : int, optional
The change in number of instances.
- Returns
- containers : list of Container
A list of containers that were started or stopped.
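The removal order and the clamping of a negative delta described above can be modeled in plain Python. This is an illustrative sketch, not skein's implementation; the function pick_containers_to_remove and the (container_id, state, start_time) tuple layout are assumptions made for the example.

```python
# Removal priority by state, as listed in the description above.
STATE_ORDER = {"WAITING": 0, "REQUESTED": 1, "RUNNING": 2}


def pick_containers_to_remove(containers, delta):
    """Choose container ids to remove for a negative ``delta``.

    ``containers`` is a list of (container_id, state, start_time) tuples;
    this layout is an assumption made for the sketch.
    """
    if delta >= 0:
        return []
    # Order by state first, then by age (smaller start_time means older).
    ordered = sorted(containers, key=lambda c: (STATE_ORDER[c[1]], c[2]))
    # Clamp: never try to remove more containers than exist.
    n_remove = min(-delta, len(ordered))
    return [c[0] for c in ordered[:n_remove]]
```

With no containers present, a delta of -1 selects nothing, which mirrors the "do nothing" behavior described for app.scale(delta=-1).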
-
set_progress(progress)
Update the progress for this application.
For applications processing a fixed set of work it may be useful for diagnostics to set the progress as the application processes.
Progress indicates job progression, and must be a float between 0 and 1. By default the progress is set at 0.1 for the application’s duration, which is a good default value for applications that don’t know their progress (e.g. interactive applications).
- Parameters
- progress : float
The application progress, must be a value between 0 and 1.
-
shutdown(status='SUCCEEDED', diagnostics=None)
Shutdown the application.
Stop all running containers and shutdown the application.
- Parameters
- status : FinalStatus, optional
The final application status. Default is ‘SUCCEEDED’.
- diagnostics : str, optional
The application exit message, usually used for diagnosing failures. Can be seen in the YARN Web UI for completed applications under “diagnostics”, as well as the diagnostic field of ApplicationReport objects. If not provided, a default will be used.
-
ui
The Skein Web UI.
Used by applications to register additional web pages, and to get the addresses of these pages.
Runtime Properties
-
skein.properties = <skein.core.Properties object>
Skein runtime properties.
This class implements an immutable mapping type, exposing properties determined at import time.
- Attributes
- application_id : str or None
The current application id. None if not running in a container.
- appmaster_address : str or None
The address of the current application’s appmaster. None if not running in a container.
- config_dir : str
The path to the configuration directory.
- container_id : str or None
The current skein container id (of the form '{service}_{instance}'). None if not running in a container.
- container_resources : Resources or None
The resources allocated to the current container. None if not in a container.
- container_dir : str or None
The absolute path to the working directory for this container. None if not in a container.
- yarn_container_id : str or None
The current YARN container id. None if not running in a container.
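Since container_id has the form '{service}_{instance}', it can be split back into its parts. The helper below is a hypothetical sketch; it assumes the instance part is an integer and splits on the last underscore so that service names containing underscores stay intact.

```python
def split_container_id(container_id):
    """Split '{service}_{instance}' into (service, instance)."""
    # Split on the last underscore so service names that themselves
    # contain underscores are preserved (an assumption about naming).
    service, _, instance = container_id.rpartition("_")
    if not service or not instance.isdigit():
        raise ValueError(
            "not of the form '{service}_{instance}': %r" % container_id
        )
    # Treating the instance as an integer is an assumption of this sketch.
    return service, int(instance)
```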
Key Value Store
-
class skein.kv.KeyValueStore(client)
The Skein Key-Value store.
Used by applications to coordinate configuration and global state.
-
clear() → None
Remove all items from the key-value store.
-
count(start=None, end=None, prefix=None)
Count keys in the key-value store.
- Parameters
- start : str, optional
The lower bound of the key range, inclusive. If not provided, no lower bound will be used.
- end : str, optional
The upper bound of the key range, exclusive. If not provided, no upper bound will be used.
- prefix : str, optional
If provided, will count the number of keys matching this prefix.
- Returns
- int
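The range and prefix parameters recur across most key-value store methods: start is inclusive, end is exclusive, and prefix selects keys by their leading characters. A plain-Python model of that selection might look as follows. The select_keys helper is hypothetical, and rejecting a prefix combined with start/end is an assumption of the sketch rather than documented behavior.

```python
def select_keys(keys, start=None, end=None, prefix=None):
    """Return the sorted keys matching a range or a prefix (illustrative)."""
    if prefix is not None and (start is not None or end is not None):
        # Assumption: a prefix and an explicit range are mutually exclusive.
        raise ValueError("specify either prefix or start/end, not both")
    selected = []
    for key in sorted(keys):
        if prefix is not None:
            matches = key.startswith(prefix)
        else:
            # start is inclusive, end is exclusive; None means unbounded.
            matches = (start is None or key >= start) and (
                end is None or key < end
            )
        if matches:
            selected.append(key)
    return selected
```

Counting is then just the length of the selection, for example len(select_keys(keys, prefix='b')).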
-
discard(key)
Discard a single key.
Returns true if the key was present, false otherwise.
- Parameters
- key : str
The key to discard.
- Returns
- bool
-
discard_prefix(prefix, return_keys=False)
Discard all key-value pairs whose keys start with prefix.
Returns either the number of keys discarded or a list of those keys, depending on the value of return_keys.
- Parameters
- prefix : str
The key prefix.
- return_keys : bool, optional
If True, the discarded keys will be returned instead of their count. Default is False.
- Returns
- int or list of keys
-
discard_range(start=None, end=None, return_keys=False)
Discard a range of keys.
Returns either the number of keys discarded or a list of those keys, depending on the value of return_keys.
- Parameters
- start : str, optional
The lower bound of the key range, inclusive. If not provided, no lower bound will be used.
- end : str, optional
The upper bound of the key range, exclusive. If not provided, no upper bound will be used.
- return_keys : bool, optional
If True, the discarded keys will be returned instead of their count. Default is False.
- Returns
- int or list of keys
-
event_queue()
Create a new EventQueue subscribed to no events.
Examples
Subscribe to events starting with 'foo' or 'bar'.
>>> foo = skein.kv.EventFilter(prefix='foo')
>>> bar = skein.kv.EventFilter(prefix='bar')
>>> queue = app.kv.event_queue()
>>> queue.subscribe(foo)
>>> queue.subscribe(bar)
>>> for event in queue:
...     if event.filter == foo:
...         print("foo event")
...     else:
...         print("bar event")
-
events(event_filter=None, key=None, prefix=None, start=None, end=None, event_type=None)
Shorthand for creating an EventQueue and adding a single filter.
May provide either an explicit event filter, or provide arguments to create a new one and add it to the queue.
If no arguments are provided, creates a queue subscribed to all events.
- Parameters
- event_filter : EventFilter
An explicit EventFilter. If provided, no other keyword arguments may be provided.
- key : str, optional
If present, only events from this key will be selected.
- prefix : str, optional
If present, only events with this key prefix will be selected.
- start : str, optional
If present, specifies the lower bound of the key range, inclusive.
- end : str, optional
If present, specifies the upper bound of the key range, exclusive.
- event_type : EventType, optional
The type of event. Default is 'ALL'.
- Returns
- EventQueue
Examples
Subscribe to all events with prefix 'foo':
>>> for event in app.kv.events(prefix='foo'):
...     if event.type == 'PUT':
...         print("PUT<key=%r, value=%r>" % (event.key, event.value))
...     else:  # DELETE
...         print("DELETE<key=%r>" % event.key)
PUT<key='foo', value=b'bar'>
PUT<key='food', value=b'biz'>
DELETE<key='food'>
PUT<key='foo', value=b'changed'>
-
exists(key)
Check if a key exists in the key-value store.
- Parameters
- key : str
The key to check the presence of.
- Returns
- bool
-
get(key, default=None, return_owner=False)
Get the value associated with a single key.
- Parameters
- key : str
The key to get.
- default : bytes or None, optional
Default value to return if the key is not present.
- return_owner : bool, optional
If True, the owner will also be returned along with the value. Default is False.
- Returns
- bytes or ValueOwnerPair
-
get_prefix(prefix, return_owner=False)
Get all key-value pairs whose keys start with prefix.
- Parameters
- prefix : str
The key prefix.
- return_owner : bool, optional
If True, the owner will also be returned along with the value. Default is False.
- Returns
- OrderedDict
-
get_range(start=None, end=None, return_owner=False)
Get a range of keys.
- Parameters
- start : str, optional
The lower bound of the key range, inclusive. If not provided, no lower bound will be used.
- end : str, optional
The upper bound of the key range, exclusive. If not provided, no upper bound will be used.
- return_owner : bool, optional
If True, the owner will also be returned along with the value. Default is False.
- Returns
- OrderedDict
-
items() → a set-like object providing a view on the store’s items
-
keys() → a set-like object providing a view on the store’s keys
-
list_keys(start=None, end=None, prefix=None)
Get a list of keys in the key-value store.
- Parameters
- start : str, optional
The lower bound of the key range, inclusive. If not provided, no lower bound will be used.
- end : str, optional
The upper bound of the key range, exclusive. If not provided, no upper bound will be used.
- prefix : str, optional
If provided, will return all keys matching this prefix.
- Returns
- list of keys
-
missing(key)
Check if a key is not in the key-value store.
This is the inverse of exists.
- Parameters
- key : str
The key to check the absence of.
- Returns
- bool
-
pop(key, default=None, return_owner=False)
Remove a single key and return its corresponding value.
- Parameters
- key : str
The key to pop.
- default : bytes or None, optional
Default value to return if the key is not present.
- return_owner : bool, optional
If True, the owner will also be returned along with the value. Default is False.
- Returns
- bytes or ValueOwnerPair
-
pop_prefix(prefix, return_owner=False)
Remove all key-value pairs whose keys start with prefix, and return their corresponding values.
- Parameters
- prefix : str
The key prefix.
- return_owner : bool, optional
If True, the owner will also be returned along with the value. Default is False.
- Returns
- OrderedDict
-
pop_range(start=None, end=None, return_owner=False)
Remove a range of keys and return their corresponding values.
- Parameters
- start : str, optional
The lower bound of the key range, inclusive. If not provided, no lower bound will be used.
- end : str, optional
The upper bound of the key range, exclusive. If not provided, no upper bound will be used.
- return_owner : bool, optional
If True, the owner will also be returned along with the value. Default is False.
- Returns
- OrderedDict
-
popitem() → (k, v)
Remove and return some (key, value) pair as a 2-tuple; raises KeyError if the store is empty.
-
put(key, value=no_change, owner=no_change)
Assign a value and/or owner for a single key.
- Parameters
- key : str
The key to put.
- value : bytes, optional
The value to put. Default is to leave the value unchanged; an error will be raised if the key doesn’t exist.
- owner : str or None, optional
The container id to claim ownership. Provide None to set to no owner. Default is to leave the owner unchanged.
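The no_change defaults exist because None is itself a meaningful owner ("no owner"), so a separate sentinel is needed to express "leave this field alone". The sketch below models that on a plain dict. It is not skein's implementation: the put function here, the KeyError, and the choice of None as the owner of a newly created key are all assumptions of the example.

```python
# Sentinel meaning "leave this field as it is"; distinct from None,
# which is a meaningful value for ``owner``.
no_change = object()


def put(store, key, value=no_change, owner=no_change):
    """Model of put on a plain dict mapping key -> (value, owner)."""
    if key in store:
        old_value, old_owner = store[key]
    elif value is no_change:
        # There is no existing value to keep, so the key must already
        # exist when no value is given (KeyError is an assumption).
        raise KeyError(key)
    else:
        # New key: assume it starts with no owner.
        old_value, old_owner = None, None
    new_value = old_value if value is no_change else value
    new_owner = old_owner if owner is no_change else owner
    store[key] = (new_value, new_owner)
```

Calling put(store, 'k', owner=None) clears the owner while keeping the value, whereas omitting owner keeps whatever owner was already set.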
-
setdefault(key, default)
Get the value associated with key, setting it to default if not present.
This transaction happens atomically on the key-value store.
- Parameters
- key : str
The key.
- default : bytes
The default value to set if the key isn’t present.
- Returns
- value : bytes
-
swap(key, value=no_change, owner=no_change, return_owner=False)
Assign a new value and/or owner for a single key, and return the previous value.
- Parameters
- key : str
The key to put.
- value : bytes, optional
The value to put. Default is to leave the value unchanged; an error will be raised if the key doesn’t exist.
- owner : str or None, optional
The container id to claim ownership. Provide None to set to no owner. Default is to leave the owner unchanged.
- return_owner : bool, optional
If True, the owner will also be returned along with the value. Default is False.
- Returns
- bytes or ValueOwnerPair
-
transaction(conditions=None, on_success=None, on_failure=None)
An atomic transaction on the key-value store.
- Parameters
- conditions : Condition or sequence of Conditions, optional
A sequence of conditions to evaluate together. The conditional expression succeeds if all conditions evaluate to True, and fails otherwise. If no conditions are provided the conditional expression also succeeds.
- on_success : Operation or sequence of Operation, optional
A sequence of operations to apply if all conditions evaluate to True.
- on_failure : Operation or sequence of Operation, optional
A sequence of operations to apply if any condition evaluates to False.
- Returns
- result : TransactionResult
A namedtuple of (succeeded, results), where results is a list of results from either the on_success or on_failure operations, depending on which branch was evaluated.
Examples
This implements an atomic compare-and-swap operation, a useful concurrency primitive. It sets key to new only if it currently is prev:
>>> from skein import kv
>>> def compare_and_swap(app, key, new, prev):
...     result = app.kv.transaction(
...         conditions=[kv.value(key) == prev],  # if key == prev
...         on_success=[kv.put(key, new)])       # then set key = new
...     return result.succeeded
>>> app.kv['key'] = b'value'
Since 'key' currently is b'value', the conditional expression succeeds and 'key' is set to b'new_value'.
>>> compare_and_swap(app, 'key', b'new_value', b'value')
True
Since 'key' currently is b'new_value' and not b'wrong', the conditional expression fails and 'key' remains unchanged.
>>> compare_and_swap(app, 'key', b'another_value', b'wrong')
False
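The evaluate-then-apply semantics of a transaction can be modeled without a cluster. The sketch below is an in-memory stand-in, not the skein implementation: TinyStore is a hypothetical class, and conditions and operations are represented as plain callables over a dict rather than skein.kv objects.

```python
import threading
from collections import namedtuple

TransactionResult = namedtuple("TransactionResult", ["succeeded", "results"])


class TinyStore:
    """In-memory stand-in for the store; not the skein implementation."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def transaction(self, conditions=(), on_success=(), on_failure=()):
        # Conditions and operations are plain callables taking the dict.
        with self._lock:  # evaluate and apply as one atomic step
            # all() of an empty sequence is True, so a transaction with
            # no conditions always takes the on_success branch.
            succeeded = all(cond(self._data) for cond in conditions)
            ops = on_success if succeeded else on_failure
            results = [op(self._data) for op in ops]
        return TransactionResult(succeeded, results)


def compare_and_swap(store, key, new, prev):
    """Set ``key`` to ``new`` only if it currently equals ``prev``."""
    result = store.transaction(
        conditions=[lambda d: d.get(key) == prev],
        on_success=[lambda d: d.__setitem__(key, new)],
    )
    return result.succeeded
```

Holding a single lock around both the condition check and the operations is what makes the compare and the swap behave as one step in this model.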
-
update(*args, **kwargs)
Update the key-value store with multiple key-value pairs atomically.
- Parameters
- arg : mapping or iterable, optional
Either a mapping or an iterable of (key, value).
- **kwargs
Extra key-value pairs to set. Semantically these are applied after any present in arg, and will thus override any intersecting keys between the two.
-
values() → an object providing a view on the store’s values
-
wait(key, return_owner=False)
Get the value associated with a single key, blocking until the key exists if not present.
- Parameters
- key : str
The key to get.
- return_owner : bool, optional
If True, the owner will also be returned along with the value. Default is False.
- Returns
- bytes or ValueOwnerPair
-
-
class skein.kv.ValueOwnerPair(value, owner)
A (value, owner) pair in the key-value store.
- Parameters
- value : bytes
The value.
- owner : str or None
The owner container_id, or None for no owner.
-
class skein.kv.count(start=None, end=None, prefix=None)
A request to count keys in the key-value store.
- Parameters
- start : str, optional
The lower bound of the key range, inclusive. If not provided, no lower bound will be used.
- end : str, optional
The upper bound of the key range, exclusive. If not provided, no upper bound will be used.
- prefix : str, optional
If provided, will count the number of keys matching this prefix.
-
class skein.kv.list_keys(start=None, end=None, prefix=None)
A request to get a list of keys in the key-value store.
- Parameters
- start : str, optional
The lower bound of the key range, inclusive. If not provided, no lower bound will be used.
- end : str, optional
The upper bound of the key range, exclusive. If not provided, no upper bound will be used.
- prefix : str, optional
If provided, will return all keys matching this prefix.
-
class skein.kv.exists(key)
A request to check if a key exists in the key-value store.
- Parameters
- key : str
The key to check the presence of.
-
class skein.kv.missing(key)
A request to check if a key is not in the key-value store.
This is the inverse of exists.
- Parameters
- key : str
The key to check the absence of.
-
class skein.kv.get(key, default=None, return_owner=False)
A request to get the value associated with a single key.
- Parameters
- key : str
The key to get.
- default : bytes or None, optional
Default value to return if the key is not present.
- return_owner : bool, optional
If True, the owner will also be returned along with the value. Default is False.
-
class skein.kv.get_prefix(prefix, return_owner=False)
A request to get all key-value pairs whose keys start with prefix.
- Parameters
- prefix : str
The key prefix.
- return_owner : bool, optional
If True, the owner will also be returned along with the value. Default is False.
-
class skein.kv.get_range(start=None, end=None, return_owner=False)
A request to get a range of keys.
- Parameters
- start : str, optional
The lower bound of the key range, inclusive. If not provided, no lower bound will be used.
- end : str, optional
The upper bound of the key range, exclusive. If not provided, no upper bound will be used.
- return_owner : bool, optional
If True, the owner will also be returned along with the value. Default is False.
-
class skein.kv.pop(key, default=None, return_owner=False)
A request to remove a single key and return its corresponding value.
- Parameters
- key : str
The key to pop.
- default : bytes or None, optional
Default value to return if the key is not present.
- return_owner : bool, optional
If True, the owner will also be returned along with the value. Default is False.
-
class skein.kv.pop_prefix(prefix, return_owner=False)
A request to remove all key-value pairs whose keys start with prefix, and return their corresponding values.
- Parameters
- prefix : str
The key prefix.
- return_owner : bool, optional
If True, the owner will also be returned along with the value. Default is False.
-
class skein.kv.pop_range(start=None, end=None, return_owner=False)
A request to remove a range of keys and return their corresponding values.
- Parameters
- start : str, optional
The lower bound of the key range, inclusive. If not provided, no lower bound will be used.
- end : str, optional
The upper bound of the key range, exclusive. If not provided, no upper bound will be used.
- return_owner : bool, optional
If True, the owner will also be returned along with the value. Default is False.
-
class skein.kv.discard(key)
A request to discard a single key.
Returns true if the key was present, false otherwise.
- Parameters
- key : str
The key to discard.
-
class skein.kv.discard_prefix(prefix, return_keys=False)
A request to discard all key-value pairs whose keys start with prefix.
Returns either the number of keys discarded or a list of those keys, depending on the value of return_keys.
- Parameters
- prefix : str
The key prefix.
- return_keys : bool, optional
If True, the discarded keys will be returned instead of their count. Default is False.
-
class skein.kv.discard_range(start=None, end=None, return_keys=False)
A request to discard a range of keys.
Returns either the number of keys discarded or a list of those keys, depending on the value of return_keys.
- Parameters
- start : str, optional
The lower bound of the key range, inclusive. If not provided, no lower bound will be used.
- end : str, optional
The upper bound of the key range, exclusive. If not provided, no upper bound will be used.
- return_keys : bool, optional
If True, the discarded keys will be returned instead of their count. Default is False.
-
class skein.kv.put(key, value=no_change, owner=no_change)
A request to assign a value and/or owner for a single key.
- Parameters
- key : str
The key to put.
- value : bytes, optional
The value to put. Default is to leave the value unchanged; an error will be raised if the key doesn’t exist.
- owner : str or None, optional
The container id to claim ownership. Provide None to set to no owner. Default is to leave the owner unchanged.
-
class skein.kv.swap(key, value=no_change, owner=no_change, return_owner=False)
A request to assign a new value and/or owner for a single key, and return the previous value.
- Parameters
- key : str
The key to put.
- value : bytes, optional
The value to put. Default is to leave the value unchanged; an error will be raised if the key doesn’t exist.
- owner : str or None, optional
The container id to claim ownership. Provide None to set to no owner. Default is to leave the owner unchanged.
- return_owner : bool, optional
If True, the owner will also be returned along with the value. Default is False.
-
class skein.kv.TransactionResult(succeeded, results)
A result from a key-value store transaction.
- Parameters
- succeeded : bool
Whether the transaction conditions evaluated to True.
- results : sequence
A sequence of results from applying all operations in the transaction’s on_success or on_failure parameters, depending on whether the conditions evaluated to True or False.
-
class skein.kv.value(key)
Represents the value for a key, for use in transaction conditions.
- Parameters
- key : str
The key to look up.
-
class skein.kv.owner(key)
Represents the owner for a key, for use in transaction conditions.
- Parameters
- key : str
The key to look up.
-
class skein.kv.comparison(key, field, operator, rhs)
A comparison of the value or owner for a specified key.
- Parameters
- key : str
The corresponding key.
- field : {‘value’, ‘owner’}
The field to compare on.
- operator : {‘==’, ‘!=’, ‘>’, ‘>=’, ‘<’, ‘<=’}
The comparison operator to use.
- rhs : bytes, str or None
The right-hand side of the condition expression. Must be bytes if field='value', or str or None if field='owner'.
-
skein.kv.is_condition(obj)
Return whether obj is a valid skein key-value store condition.
-
skein.kv.is_operation(obj)
Return whether obj is a valid skein key-value store operation.
-
class
skein.kv.
EventType
(x)¶ Event types to listen on.
- Attributes
- ALLEventType
All events.
- PUTEventType
Only PUT events.
- DELETEEventType
Only DELETE events.
-
class
skein.kv.
Event
(key, result, event_type, event_filter)¶ An event in the key-value store.
- Parameters
- keystr
The key affected.
- resultValueOwnerPair or None
The value and owner for the key. None if a 'DELETE' event.
- event_typeEventType
The type of event.
- event_filterEventFilter
The event filter that generated the event.
-
class
skein.kv.
EventFilter
(key=None, prefix=None, start=None, end=None, event_type=None)¶ An event filter.
Specifies a subset of events to watch for. May specify one of key, prefix, or start/end. If no parameters are provided, selects all events.
- Parameters
- keystr, optional
If present, only events from this key will be selected.
- prefixstr, optional
If present, only events with this key prefix will be selected.
- startstr, optional
If present, specifies the lower bound of the key range, inclusive.
- endstr, optional
If present, specifies the upper bound of the key range, exclusive.
- event_typeEventType, optional
The type of event. Default is 'ALL'.
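The selection rules above (exact key, key prefix, or a half-open key range) can be sketched as a small predicate. This is a rough model, assuming keys are compared as ordinary Python strings; the function name `matches` is hypothetical and is not part of Skein.

```python
def matches(event_key, key=None, prefix=None, start=None, end=None):
    """Return True if `event_key` falls in the subset the filter describes."""
    if key is not None:
        return event_key == key
    if prefix is not None:
        return event_key.startswith(prefix)
    if start is not None and event_key < start:   # lower bound is inclusive
        return False
    if end is not None and event_key >= end:      # upper bound is exclusive
        return False
    return True  # no parameters: all events are selected


filters = [{"prefix": "bar"}, {"key": "bart"}, {"start": "a", "end": "b"}]
hits = [f for f in filters if matches("bart", **f)]
```

In this sketch the key `"bart"` is selected by the prefix filter and the exact-key filter, but not by the range filter, whose upper bound `"b"` is exclusive.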
-
class
skein.kv.
EventQueue
(kv)¶ A queue of events on the key-value store.
Besides the normal Queue interface, also supports iteration.
>>> for event in app.kv.events(prefix='bar'):
...     print(event)
If an event falls into multiple selected filters, it will be placed in the event queue once for each filter. For example, prefix='bar' and key='bart' would both receive events on key='bart'. If a queue was subscribed to both filters, changes to this key would be placed in the queue twice, once for each filter.
All events are unsubscribed when this object is collected. Can also be used as a context manager to unsubscribe-all on __exit__, or explicitly call unsubscribe_all.
-
get
(block=True, timeout=None)¶ Remove and return an item from the queue.
If optional args ‘block’ is true and ‘timeout’ is None (the default), block if necessary until an item is available. If ‘timeout’ is a non-negative number, it blocks at most ‘timeout’ seconds and raises the Empty exception if no item was available within that time. Otherwise (‘block’ is false), return an item if one is immediately available, else raise the Empty exception (‘timeout’ is ignored in that case).
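The blocking rules above are those of the standard library queue interface. A short sketch using `queue.Queue` as a stand-in for an event queue (no Skein objects are involved, and the item `"event-1"` is made up):

```python
import queue

q = queue.Queue()  # stand-in for an EventQueue; no Skein objects involved
q.put("event-1")

first = q.get()  # an item is available, so this returns immediately

try:
    q.get(block=False)  # nothing left: raises Empty right away
    outcome = "got item"
except queue.Empty:
    outcome = "empty"

try:
    q.get(timeout=0.05)  # waits up to 0.05 seconds, then raises Empty
    timed_outcome = "got item"
except queue.Empty:
    timed_outcome = "empty"
```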
-
put
(item, block=True, timeout=None)¶ Put an item into the queue.
If optional args ‘block’ is true and ‘timeout’ is None (the default), block if necessary until a free slot is available. If ‘timeout’ is a non-negative number, it blocks at most ‘timeout’ seconds and raises the Full exception if no free slot was available within that time. Otherwise (‘block’ is false), put an item on the queue if a free slot is immediately available, else raise the Full exception (‘timeout’ is ignored in that case).
-
subscribe
(event_filter=None, key=None, prefix=None, start=None, end=None, event_type=None)¶ Subscribe to an event filter.
May provide either an explicit event filter, or provide arguments to create a new one and add it to the queue. In either case, the event filter is returned.
If no arguments are provided, subscribes to all events.
- Parameters
- event_filterEventFilter
An explicit EventFilter. If provided, no other keyword arguments may be provided.
- keystr, optional
If present, only events from this key will be selected.
- prefixstr, optional
If present, only events with this key prefix will be selected.
- startstr, optional
If present, specifies the lower bound of the key range, inclusive.
- endstr, optional
If present, specifies the upper bound of the key range, exclusive.
- event_typeEventType, optional
The type of event. Default is 'ALL'.
- Returns
- EventFilter
-
unsubscribe
(event_filter=None, key=None, prefix=None, start=None, end=None, event_type=None)¶ Unsubscribe from an event filter.
May provide either an explicit event filter, or provide arguments describing the filter to remove.
If no arguments are provided, unsubscribes from a filter of all events.
A ValueError is raised if the specified filter isn’t currently subscribed to.
- Parameters
- event_filterEventFilter
An explicit EventFilter. If provided, no other keyword arguments may be provided.
- keystr, optional
If present, only events from this key will be selected.
- prefixstr, optional
If present, only events with this key prefix will be selected.
- startstr, optional
If present, specifies the lower bound of the key range, inclusive.
- endstr, optional
If present, specifies the upper bound of the key range, exclusive.
- event_typeEventType, optional
The type of event. Default is 'ALL'.
- Returns
- EventFilter
-
unsubscribe_all
()¶ Unsubscribe from all event filters
Web UI¶
-
class
skein.ui.
WebUI
(client)¶ The Skein WebUI.
-
property
proxy_prefix
¶ The path between the Web UI address and the proxied pages.
-
property
addresses
¶ All known addresses of the Web UI.
In most cases this will be a list of length 1. If YARN is running in high availability mode, there may be several available addresses. Use the address attribute to choose one at random.
-
property
address
¶ The address of the Web UI.
If YARN is running in high availability mode, a single address is chosen at random on first access.
-
add_page
(route, target, link_name=None)¶ Add a new proxied page to the Web UI.
- Parameters
- routestr
The route for the proxied page. Must be a valid path segment in a url (e.g. foo in /foo/bar/baz). Routes must be unique across the application.
- targetstr
The target address to be proxied to this page. Must be a valid url.
- link_namestr, optional
If provided, will be the link text used in the Web UI. If not provided, the page will still be proxied, but no link will be added to the Web UI. Link names must be unique across the application.
- Returns
- ProxiedPage
-
remove_page
(route)¶ Remove a proxied page from the Web UI.
- Parameters
- routestr
The route for the proxied page. Must be a valid path segment in a url (e.g. foo in /foo/bar/baz). Routes must be unique across the application.
-
get_pages
()¶ Get all registered pages.
- Returns
- pagesdict
A dict of route to ProxiedPage for all pages.
-
class
skein.ui.
ProxiedPage
(route, target, link_name, _ui_address, _proxy_prefix)¶ A page proxied by the Skein Web UI.
- Attributes
- routestring
The route used in the address to the proxied page.
- targetstring
The target address of the proxy.
- link_namestring or None
The name of the linked page in the Web UI. If None, the page is still proxied but isn’t linked to in the Web UI.
- addressstring
The full proxied address to this page.
-
property
address
¶ The full proxied address to this page
Application Specification¶
-
class
skein.
ApplicationSpec
(services=None, master=None, name='skein', queue='default', user='', node_label='', tags=None, file_systems=None, acls=None, max_attempts=1)¶ A complete description of an application.
- Parameters
- servicesdict, optional
A mapping of service-name to services. Applications must either specify at least one service, or a script for the application master to run (see skein.Master for more information).
- masterMaster, optional
Additional configuration for the application master service. See skein.Master for more information.
- namestr, optional
The name of the application, defaults to ‘skein’.
- queuestr, optional
The queue to submit to. Defaults to the default queue.
- userstr, optional
The user name to submit the application as. Requires that the submitting user have permission to proxy as this user name. Default is the submitter’s user name.
- node_labelstr, optional
The node label expression to use when requesting containers for this application. Services can override this setting by specifying node_label on the service directly. Default is no label.
- tagsset, optional
A set of strings to use as tags for this application.
- file_systemslist, optional
A list of Hadoop file systems to acquire delegation tokens for. A token is always acquired for the defaultFS.
- aclsACLs, optional
Allows restricting users/groups to subsets of application access. See skein.ACLs for more information.
- max_attemptsint, optional
The maximum number of submission attempts before marking the application as failed. Note that this only considers failures of the application master during startup. Default is 1.
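To make the parameter list above more concrete, here is a sketch of a specification written as a plain dict whose keys follow the documented parameter names. Every name and value in it is hypothetical. It also derives one valid start order from the `depends` entries using the standard library `graphlib` module (Python 3.9+); how Skein itself orders startup is not shown here.

```python
import json
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# A hypothetical two-service application; every name and value is made up.
spec = {
    "name": "example",
    "queue": "default",
    "max_attempts": 1,
    "services": {
        "scheduler": {
            "resources": {"memory": "2 GiB", "vcores": 1},
            "script": "echo scheduler",
        },
        "worker": {
            "resources": {"memory": "4 GiB", "vcores": 2},
            "script": "echo worker",
            "instances": 3,
            "depends": ["scheduler"],
        },
    },
}

text = json.dumps(spec, indent=2)  # the kind of JSON a spec file might hold

# One valid start order given the `depends` entries (dependencies first).
graph = {name: set(svc.get("depends", ()))
         for name, svc in spec["services"].items()}
start_order = list(TopologicalSorter(graph).static_order())
```

With these entries the scheduler service comes before the worker service in `start_order`, since the worker lists it under `depends`.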
-
classmethod
from_dict
(obj, **kwargs)¶ Create an instance from a dict.
Keys in the dict should match parameter names
-
classmethod
from_file
(path, format='infer')¶ Create an instance from a json or yaml file.
- Parameters
- pathstr
The path to the file to load.
- format{‘infer’, ‘json’, ‘yaml’}, optional
The file format. By default the format is inferred from the file extension.
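Inferring the format from the file extension might look like the sketch below. The helper name `infer_format` and the exact set of recognised extensions (including `.yml`) are assumptions made for illustration, not a statement of what Skein accepts.

```python
import os


def infer_format(path, format="infer"):
    """Resolve 'infer' to 'json' or 'yaml' using the file extension."""
    if format != "infer":
        if format not in ("json", "yaml"):
            raise ValueError("unknown format: %r" % format)
        return format
    ext = os.path.splitext(path)[1].lower()
    if ext == ".json":
        return "json"
    if ext in (".yaml", ".yml"):  # treating .yml as yaml is an assumption
        return "yaml"
    raise ValueError("cannot infer format from extension %r" % ext)
```

Passing an explicit format bypasses the extension check, which is useful for files with unusual names.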
-
classmethod
from_json
(b)¶ Create an instance from a json string.
Keys in the json object should match parameter names
-
classmethod
from_protobuf
(obj)¶ Create an instance from a protobuf message.
-
classmethod
from_yaml
(b)¶ Create an instance from a yaml string.
-
to_dict
(skip_nulls=True)¶ Convert object to a dict
-
to_file
(path, format='infer', skip_nulls=True)¶ Write object to a file.
- Parameters
- pathstr
The path to the file to write.
- format{‘infer’, ‘json’, ‘yaml’}, optional
The file format. By default the format is inferred from the file extension.
- skip_nullsbool, optional
By default null values are skipped in the output. Set to False to output all fields.
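The effect of `skip_nulls` can be illustrated with a plain dict. This is a sketch of the idea only; the free function `to_dict` below is a hypothetical stand-in, not the method defined on Skein's classes.

```python
import json


def to_dict(fields, skip_nulls=True):
    """Return a copy of `fields`, optionally dropping keys whose value is None."""
    if skip_nulls:
        return {k: v for k, v in fields.items() if v is not None}
    return dict(fields)


fields = {"name": "skein", "queue": "default", "tags": None}
compact = json.dumps(to_dict(fields), sort_keys=True)
full = json.dumps(to_dict(fields, skip_nulls=False), sort_keys=True)
```

`compact` omits the `tags` key entirely, while `full` keeps it with a JSON `null`.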
-
to_json
(skip_nulls=True)¶ Convert object to a json string
-
to_protobuf
()¶ Convert object to a protobuf message
-
to_yaml
(skip_nulls=True)¶ Convert object to a yaml string
-
class
skein.
LogLevel
(x)¶ Enum of log levels.
Corresponds with log4j logging levels.
- Attributes
- ALLLogLevel
Turns on all logging.
- TRACELogLevel
The finest level of events.
- DEBUGLogLevel
Fine-grained informational events that are most useful to debug an application.
- INFOLogLevel
Informational messages that highlight the progress of the application at a coarse-grained level. The default LogLevel.
- WARNLogLevel
Potentially harmful situations that still allow the application to continue running.
- ERRORLogLevel
Error events that might still allow the application to continue running.
- FATALLogLevel
Severe error events that will lead the application to abort.
- OFFLogLevel
Turns off all logging.
-
classmethod
values
()¶ The constants of this enum type, in the order they are declared.
-
class
skein.
Master
(resources=None, script='', files=None, env=None, log_level=LogLevel.INFO, log_config=None, security=None)¶ Configuration for the Application Master.
- Parameters
- resourcesResources, optional
Describes the resources needed to run the application master. Default is 512 MiB, 1 virtual core.
- scriptstr, optional
An optional bash script to run after starting the application master. If provided, the application will terminate once the script has completed.
- filesdict, optional
Describes any additional files needed to run the application master. A mapping of destination relative paths to File or str objects describing the sources for these paths. If a str, the file type is inferred from the extension.
- envdict, optional
A mapping of environment variables to set on the application master.
- log_levelstr or LogLevel, optional
The application master log level. Sets the skein.log.level system property. One of {‘ALL’, ‘TRACE’, ‘DEBUG’, ‘INFO’, ‘WARN’, ‘ERROR’, ‘FATAL’, ‘OFF’} (from most to least verbose). Default is ‘INFO’.
- log_configstr or File, optional
A custom log4j.properties file to use for the application master. If not provided, the default logging configuration will be used.
- securitySecurity, optional
The security credentials to use for the application master. If not provided, these will be the same as those used by the submitting client.
-
classmethod
from_dict
(obj, **kwargs)¶ Create an instance from a dict.
Keys in the dict should match parameter names
-
classmethod
from_json
(b)¶ Create an instance from a json string.
Keys in the json object should match parameter names
-
classmethod
from_protobuf
(obj)¶ Create an instance from a protobuf message.
-
classmethod
from_yaml
(b)¶ Create an instance from a yaml string.
-
to_dict
(skip_nulls=True)¶ Convert object to a dict
-
to_json
(skip_nulls=True)¶ Convert object to a json string
-
to_protobuf
()¶ Convert object to a protobuf message
-
to_yaml
(skip_nulls=True)¶ Convert object to a yaml string
-
class
skein.
Service
(resources=required, script=required, instances=1, files=None, env=None, depends=None, max_restarts=0, allow_failures=False, node_label='', nodes=None, racks=None, relax_locality=False)¶ Description of a Skein service.
- Parameters
- resourcesResources
Describes the resources needed to run the service.
- scriptstr
A bash script to run the service.
- instancesint, optional
The number of instances to create on startup. Default is 1.
- filesdict, optional
Describes any files needed to run the service. A mapping of destination relative paths to File or str objects describing the sources for these paths. If a str, the file type is inferred from the extension.
- envdict, optional
A mapping of environment variables needed to run the service.
- dependsset, optional
A set of service names that this service depends on. The service will only be started after all its dependencies have been started.
- max_restartsint, optional
The maximum number of restarts to allow for this service. Containers are only restarted on failure, and the cap is set for all containers in the service, not per container. Set to -1 to allow infinite restarts. Default is 0.
- allow_failuresbool, optional
If False (default), the whole application will shut down if the number of failures for this service exceeds max_restarts. Set to True to keep the application running even if this service exceeds its failure limit.
- node_labelstr, optional
The node label expression to use when requesting containers for this service. If not set, defaults to the application-level node_label (if set).
- nodeslist, optional
A list of node host names to restrict containers for this service to. If not set, defaults to no node restrictions.
- rackslist, optional
A list of rack names to restrict containers for this service to. The racks corresponding to any nodes requested will be automatically added to this list. If not set, defaults to no rack restrictions.
- relax_localitybool, optional
If True, containers for this request may be assigned on hosts and racks other than the ones explicitly requested. If False, those restrictions are strictly enforced. Default is False.
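The interaction between `max_restarts` and `allow_failures` can be summarised as a small decision function. This is a sketch of the documented behaviour, assuming failures are counted across the whole service; the function name and the returned strings are hypothetical.

```python
def on_container_failure(failures, max_restarts, allow_failures=False):
    """Decide what happens after the service's `failures`-th container failure.

    `failures` counts failures across the whole service, not per container.
    """
    if max_restarts == -1 or failures <= max_restarts:
        return "restart container"
    if allow_failures:
        return "keep application running"
    return "shut down application"
```

With the default `max_restarts=0`, the first failure already exceeds the cap, so the application shuts down unless `allow_failures` is set.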
-
classmethod
from_dict
(obj, **kwargs)¶ Create an instance from a dict.
Keys in the dict should match parameter names
-
classmethod
from_json
(b)¶ Create an instance from a json string.
Keys in the json object should match parameter names
-
classmethod
from_protobuf
(obj)¶ Create an instance from a protobuf message.
-
classmethod
from_yaml
(b)¶ Create an instance from a yaml string.
-
to_dict
(skip_nulls=True)¶ Convert object to a dict
-
to_json
(skip_nulls=True)¶ Convert object to a json string
-
to_protobuf
()¶ Convert object to a protobuf message
-
to_yaml
(skip_nulls=True)¶ Convert object to a yaml string
-
class
skein.
FileType
(x)¶ Enum of possible file types to distribute with the application.
- Attributes
- FILEFileType
Regular file
- ARCHIVEFileType
A .zip, .tar.gz, or .tgz file to be automatically unarchived in the containers.
-
classmethod
values
()¶ The constants of this enum type, in the order they are declared.
-
class
skein.
FileVisibility
(x)¶ Enum of possible file visibilities.
Determines how the file can be shared between containers.
- Attributes
- APPLICATIONFileVisibility
Shared only among containers of the same application on the node.
- PUBLICFileVisibility
Shared by all users on the node.
- PRIVATEFileVisibility
Shared among all applications of the same user on the node.
-
classmethod
values
()¶ The constants of this enum type, in the order they are declared.
-
class
skein.
File
(source=required, type='infer', visibility=FileVisibility.APPLICATION, size=0, timestamp=0)¶ A file/archive to distribute with the service.
- Parameters
- sourcestr
The path to the file/archive. If no scheme is specified, path is assumed to be on the local filesystem (file:// scheme).
- typeFileType or str, optional
The type of file to distribute. Archives are automatically extracted by YARN into a directory with the same name as their destination. By default the type is inferred from the file extension.
- visibilityFileVisibility or str, optional
The resource visibility. Default is FileVisibility.APPLICATION.
- sizeint, optional
The resource size in bytes. If not provided will be determined by the file system.
- timestampint, optional
The time the resource was last modified. If not provided will be determined by the file system.
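Inferring the file type from the extension could be sketched as follows, using the archive extensions listed for FileType.ARCHIVE. The helper name is hypothetical, and the case-insensitive match is an assumption of this sketch.

```python
def infer_file_type(source):
    """Guess 'ARCHIVE' or 'FILE' from the extension of `source`."""
    archive_suffixes = (".zip", ".tar.gz", ".tgz")
    # Lower-casing before matching is an assumption of this sketch.
    return "ARCHIVE" if source.lower().endswith(archive_suffixes) else "FILE"
```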
-
classmethod
from_dict
(obj, **kwargs)¶ Create an instance from a dict.
Keys in the dict should match parameter names
-
classmethod
from_json
(b)¶ Create an instance from a json string.
Keys in the json object should match parameter names
-
classmethod
from_protobuf
(obj)¶ Create an instance from a protobuf message.
-
classmethod
from_yaml
(b)¶ Create an instance from a yaml string.
-
to_dict
(skip_nulls=True)¶ Convert object to a dict
-
to_json
(skip_nulls=True)¶ Convert object to a json string
-
to_protobuf
()¶ Convert object to a protobuf message
-
to_yaml
(skip_nulls=True)¶ Convert object to a yaml string
-
class
skein.
Resources
(memory=required, vcores=required, gpus=0, fpgas=0)¶ Resource requests per container.
- Parameters
- memorystr or int
The amount of memory to request. Can be either a string with units (e.g. "5 GiB"), or numeric. If numeric, specifies the amount of memory in MiB. Note that the units are in mebibytes (MiB) NOT megabytes (MB) - the former being binary based (1024 MiB in a GiB), the latter being decimal based (1000 MB in a GB).
Requests smaller than the minimum allocation will receive the minimum allocation (1024 MiB by default). Requests larger than the maximum allocation will error on application submission.
- vcoresint
The number of virtual cores to request. Depending on your system configuration one virtual core may map to a single actual core, or a fraction of a core. Requests larger than the maximum allocation will error on application submission.
- gpusint, optional
The number of gpus to request. Requires Hadoop >= 3.1, sets resource requirements for yarn.io/gpu. Default is 0.
- fpgasint, optional
The number of fpgas to request. Requires Hadoop >= 3.1, sets resource requirements for yarn.io/fpga. Default is 0.
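The difference between binary and decimal units, and the effect of the minimum allocation, can be worked through numerically. The sketch below is not Skein's parser: the unit table, the rounding up to whole MiB and the helper name are assumptions chosen for illustration.

```python
import math

# Bytes per unit; a small, non-exhaustive table chosen for illustration.
_UNITS = {"MiB": 2**20, "GiB": 2**30, "MB": 10**6, "GB": 10**9}


def memory_in_mib(memory, min_allocation=1024):
    """Convert a request to whole MiB and raise it to the minimum allocation."""
    if isinstance(memory, str):
        number, unit = memory.split()
        n_bytes = float(number) * _UNITS[unit]
        mib = math.ceil(n_bytes / 2**20)  # rounding up is an assumption
    else:
        mib = int(memory)  # bare numbers are already in MiB
    return max(mib, min_allocation)
```

Under these assumptions `"5 GiB"` is 5120 MiB while `"5 GB"` is only 4769 MiB, and a request for 512 MiB is raised to the 1024 MiB minimum.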
-
classmethod
from_dict
(obj)¶ Create an instance from a dict.
Keys in the dict should match parameter names
-
classmethod
from_json
(b)¶ Create an instance from a json string.
Keys in the json object should match parameter names
-
classmethod
from_protobuf
(msg)¶ Create an instance from a protobuf message.
-
classmethod
from_yaml
(b)¶ Create an instance from a yaml string.
-
to_dict
(skip_nulls=True)¶ Convert object to a dict
-
to_json
(skip_nulls=True)¶ Convert object to a json string
-
to_protobuf
()¶ Convert object to a protobuf message
-
to_yaml
(skip_nulls=True)¶ Convert object to a yaml string
-
class
skein.
ACLs
(enable=False, view_users=None, view_groups=None, modify_users=None, modify_groups=None, ui_users=None)¶ Skein Access Control Lists.
Maps access types to users/groups to provide that access.
The following access types are supported:
VIEW : view application details
MODIFY : modify the application via YARN (e.g. killing the application)
UI : access the application Web UI
The VIEW and MODIFY access types are handled by YARN directly; permissions for these can be set by users and/or groups. Authorizing UI access is handled by Skein internally, and only user-level access control is supported.
The application owner (the user who submitted the application) will always have permission for all access types.
By default, ACLs are disabled - to enable, set enable=True. If enabled, access is restricted only to the application owner by default - add users/groups to the access types you wish to expand to other users.
- Parameters
- enablebool, optional
If True, the ACLs will be enforced. Default is False.
- view_users, view_groupslist, optional
Lists of users/groups to give VIEW access to this application. If both are empty, only the application owner has access (default). If either contains "*", all users are given access. See the YARN documentation for more information on what VIEW access entails.
- modify_users, modify_groupslist, optional
Lists of users/groups to give MODIFY access to this application. If both are empty, only the application owner has access (default). If either contains "*", all users are given access. See the YARN documentation for more information on what MODIFY access entails.
- ui_userslist, optional
A list of users to give access to the application Web UI. If empty, only the application owner has access (default). If it contains "*", all users are given access.
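The access rules described above can be condensed into one predicate per access type. This is a simplified model of the documented rules, not the enforcement code (VIEW and MODIFY checks are performed by YARN); the function name and argument names are hypothetical.

```python
def has_access(user, user_groups, owner, enable,
               allowed_users=(), allowed_groups=()):
    """Return True if `user` holds one access type (e.g. VIEW) on the application."""
    if not enable:
        return True                      # ACLs disabled: nothing is restricted
    if user == owner:
        return True                      # the owner always has every access type
    if "*" in allowed_users or "*" in allowed_groups:
        return True                      # wildcard opens access to all users
    return user in allowed_users or bool(set(user_groups) & set(allowed_groups))
```

For the UI access type, which only supports user-level control, the group arguments would simply be left empty.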
Examples
By default ACLs are disabled, and all users have access.
>>> import skein
>>> acls = skein.ACLs()
Enabling ACLs results in only the application owner having access (provided YARN is also configured with ACLs enabled).
>>> acls = skein.ACLs(enable=True)
To give access to other users, add users/groups to the desired access types. Here we enable view access for all users in group engineering, and modify access for user nancy.
>>> acls = skein.ACLs(enable=True,
...                   view_groups=['engineering'],
...                   modify_users=['nancy'])
You can use the wildcard character "*" to enable access for all users. Here we give view access to all users:
>>> acls = skein.ACLs(enable=True,
...                   view_users=['*'])
-
classmethod
from_dict
(obj)¶ Create an instance from a dict.
Keys in the dict should match parameter names
-
classmethod
from_json
(b)¶ Create an instance from a json string.
Keys in the json object should match parameter names
-
classmethod
from_protobuf
(obj)¶ Create an instance from a protobuf message.
-
classmethod
from_yaml
(b)¶ Create an instance from a yaml string.
-
to_dict
(skip_nulls=True)¶ Convert object to a dict
-
to_json
(skip_nulls=True)¶ Convert object to a json string
-
to_protobuf
()¶ Convert object to a protobuf message
-
to_yaml
(skip_nulls=True)¶ Convert object to a yaml string
-
class
skein.
Security
(cert_file=None, key_file=None, cert_bytes=None, key_bytes=None)¶ Security configuration.
Secrets may be specified either as file paths or raw bytes, but not both.
- Parameters
- cert_filestr, File, optional
The TLS certificate file, in pem format. Either a path or a fully specified File object.
- key_filestr, File, optional
The TLS private key file, in pem format. Either a path or a fully specified File object.
- cert_bytesbytes, optional
The contents of the TLS certificate file, in pem format.
- key_bytesbytes, optional
The contents of the TLS private key file, in pem format.
-
classmethod
from_default
()¶ The default security configuration.
Usually this loads the credentials stored in the configuration directory (~/.skein by default). If these credentials don’t already exist, new ones will be created.
When run in a YARN container started by Skein, this loads the same security credentials as used for the current application.
-
classmethod
from_dict
(obj, **kwargs)¶ Create an instance from a dict.
Keys in the dict should match parameter names
-
classmethod
from_directory
(directory)¶ Create a security object from a directory.
Relies on standard names for each file (skein.crt and skein.pem).
-
classmethod
from_json
(b)¶ Create an instance from a json string.
Keys in the json object should match parameter names
-
classmethod
from_protobuf
(obj)¶ Create an instance from a protobuf message.
-
classmethod
from_yaml
(b)¶ Create an instance from a yaml string.
-
classmethod
new_credentials
()¶ Create a new Security object with a new certificate/key pair.
-
to_dict
(skip_nulls=True)¶ Convert object to a dict
-
to_directory
(directory, force=False)¶ Write this security object to a directory.
- Parameters
- directorystr
The directory to write the configuration to.
- forcebool, optional
If security credentials already exist at this location, an error will be raised by default. Set to True to overwrite existing files.
- Returns
- securitySecurity
A new security object backed by the written files.
-
to_json
(skip_nulls=True)¶ Convert object to a json string
-
to_protobuf
()¶ Convert object to a protobuf message
-
to_yaml
(skip_nulls=True)¶ Convert object to a yaml string
Application Responses¶
-
class
skein.model.
ApplicationState
(x)¶ Enum of application states.
- Attributes
- NEWApplicationState
Application was just created.
- NEW_SAVINGApplicationState
Application is being saved.
- SUBMITTEDApplicationState
Application has been submitted.
- ACCEPTEDApplicationState
Application has been accepted by the scheduler.
- RUNNINGApplicationState
Application is currently running.
- FINISHEDApplicationState
Application finished successfully.
- FAILEDApplicationState
Application failed.
- KILLEDApplicationState
Application was terminated by a user or admin.
-
classmethod
values
()¶ The constants of this enum type, in the order they are declared.
-
class
skein.model.
FinalStatus
(x)¶ Enum of application final statuses.
- Attributes
- SUCCEEDEDFinalStatus
Application finished successfully.
- KILLEDFinalStatus
Application was terminated by a user or admin.
- FAILEDFinalStatus
Application failed.
- UNDEFINEDFinalStatus
Application has not yet finished.
-
classmethod
values
()¶ The constants of this enum type, in the order they are declared.
-
class
skein.model.
ApplicationReport
(id, name, user, queue, tags, host, port, tracking_url, state, final_status, progress, usage, diagnostics, start_time, finish_time)¶ Report of application status.
- Parameters
- idstr
The application ID.
- namestr
The application name.
- userstr
The user that started the application.
- queuestr
The application queue.
- tagsset of strings
The application tags.
- hoststr
The host the application master is running on.
- portint
The rpc port for the application master
- tracking_urlstr
The application tracking url.
- stateApplicationState
The application state.
- final_statusFinalStatus
The application final status.
- progressfloat
The progress of the application, from 0.0 to 1.0.
- usageResourceUsageReport
Report on application resource usage.
- diagnosticsstr
The diagnostic message in the case of failures.
- start_timedatetime
The application start time.
- finish_timedatetime
The application finish time.
-
classmethod
from_protobuf
(obj)¶ Create an instance from a protobuf message.
-
property
runtime
¶ The total runtime of the application.
-
to_protobuf
()¶ Convert object to a protobuf message
-
class
skein.model.
ResourceUsageReport
(memory_seconds, vcore_seconds, num_used_containers, needed_resources, reserved_resources, used_resources)¶ Resource usage report.
- Parameters
- memory_secondsint
The total amount of memory (in MBs) the application has allocated times the number of seconds the application has been running.
- vcore_secondsint
The total number of vcores that the application has allocated times the number of seconds the application has been running.
- num_used_containersint
Current number of containers in use.
- needed_resourcesResources
The needed resources.
- reserved_resourcesResources
The reserved resources.
- used_resourcesResources
The used resources.
-
classmethod
from_protobuf
(obj)¶ Create an instance from a protobuf message.
-
to_protobuf
()¶ Convert object to a protobuf message
-
class
skein.model.
ContainerState
(x)¶ Enum of container states.
- Attributes
- WAITINGContainerState
Container is waiting on another service to startup before being requested.
- REQUESTEDContainerState
Container has been requested but is not currently running.
- RUNNINGContainerState
Container is currently running.
- SUCCEEDEDContainerState
Container finished successfully.
- FAILEDContainerState
Container failed.
- KILLEDContainerState
Container was terminated by a user or admin.
-
classmethod
values
()¶ The constants of this enum type, in the order they are declared.
-
class
skein.model.
Container
(service_name, instance, state, yarn_container_id, yarn_node_http_address, start_time, finish_time, exit_message)¶ Current container state.
- Parameters
- service_namestr
The name of the service this container is running.
- instanceint
The container instance number.
- stateContainerState
The current container state.
- yarn_container_idstr
The YARN container id.
- yarn_node_http_addressstr
The YARN node HTTP address given as host:port.
- start_timedatetime
The start time, None if container has not started.
- finish_timedatetime
The finish time, None if container has not finished.
- exit_messagestr
The diagnostic exit message for completed containers.
-
classmethod
from_protobuf
(obj)¶ Create an instance from a protobuf message.
-
property
id
¶ The complete service_name & instance identity of this container.
-
property
runtime
¶ The total runtime of the container.
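A runtime of this kind can be thought of as the time elapsed between start_time and finish_time. The sketch below is one plausible reading, not the property's actual definition: falling back to the current time while still running, and returning zero before start, are assumptions.

```python
from datetime import datetime, timedelta


def runtime(start_time, finish_time=None, now=None):
    """Elapsed time between start and finish, or up to `now` if still running."""
    if start_time is None:
        return timedelta(0)              # not started yet (sketch's choice)
    end = finish_time or now or datetime.now()
    return end - start_time


started = datetime(2018, 5, 12, 9, 0, 0)
finished = datetime(2018, 5, 12, 9, 2, 30)
elapsed = runtime(started, finished)
```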
-
to_protobuf
()¶ Convert object to a protobuf message
-
class
skein.model.
ApplicationLogs
(app_id, logs)¶ A mapping of yarn_container_id to their logs for an application.
-
dump
(file=None)¶ Write the logs to a file or stdout.
- Parameters
- filefile-like, optional
A file-like object to write the logs to. Defaults to sys.stdout.
-
dumps
()¶ Write the logs to a string.
-
class
skein.model.
NodeState
(x)¶ Enum of node states.
- Attributes
- DECOMMISSIONEDNodeState
Node is out of service.
- DECOMMISSIONINGNodeState
Node is currently decommissioning.
- LOSTNodeState
Node has not responded for some time.
- NEWNodeState
Node has just started.
- REBOOTEDNodeState
Node has just rebooted.
- RUNNINGNodeState
Node is currently running.
- SHUTDOWNNodeState
Node has been shutdown gracefully.
- UNHEALTHYNodeState
Node is unhealthy.
-
classmethod
values
()¶ The constants of this enum type, in the order they are declared.
-
class
skein.model.
NodeReport
(id, http_address, rack_name, labels, state, health_report, total_resources, used_resources)¶ Report of node status.
- Attributes
- idstr
The node id.
- http_addressstr
The http address to the node manager.
- rack_namestr
The rack name for this node.
- labelsset
Node labels for this node.
- stateNodeState
The node’s current state.
- health_reportstr
The diagnostic health report for this node.
- total_resourcesResources
Total resources available on this node.
- used_resourcesResources
Resources currently in use on this node.
-
classmethod
from_protobuf
(obj)¶ Create an instance from a protobuf message.
-
property
host
¶ The node manager host for this node.
-
property
port
¶ The node manager port for this node.
-
to_protobuf
()¶ Convert object to a protobuf message
-
class
skein.model.
QueueState
(x)¶ Enum of queue states.
- Attributes
- RUNNINGQueueState
Queue is running, normal operation.
- STOPPEDQueueState
Queue is stopped, no longer taking new requests.
-
classmethod
values
()¶ The constants of this enum type, in the order they are declared.
-
class
skein.model.
Queue
(name, state, capacity, max_capacity, percent_used, node_labels, default_node_label)¶ Information about a specific YARN queue.
- Attributes
- namestr
The queue’s name.
- stateQueueState
The queue’s state.
- capacityfloat
The queue’s capacity as a percentage. For the capacity scheduler, the queue is guaranteed access to this percentage of the parent queue’s resources (if sibling queues are running over their limit, there may be a lag accessing resources as those applications scale down). For the fair scheduler, this number is the percentage of the total cluster this queue currently has in its fair share (this will shift dynamically during cluster use).
- max_capacityfloat
The queue’s max capacity as a percentage. For the capacity scheduler, this queue may elastically expand to use up to this percentage of its parent’s resources if its siblings aren’t running at their capacity. For the fair scheduler this is always 100%.
- percent_usedfloat
The percent of this queue’s capacity that’s currently in use. This may be over 100% if elasticity is in effect.
- node_labelsset
A set of all accessible node labels for this queue. If all node labels are accessible this is the set {"*"}.
- default_node_labelstr
The default node label for this queue. This will be used if the application doesn’t specify a node label itself.
-
classmethod
from_protobuf
(obj)¶ Create an instance from a protobuf message.
-
to_protobuf
()¶ Convert object to a protobuf message
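The three percentage attributes combine as follows under the capacity scheduler. This is a sketch of one reading of the attribute descriptions above, not a skein API: `QueueInfo`, `parent_share_used`, and `elastic_headroom` are hypothetical names, and the numbers are made up for illustration.

```python
from collections import namedtuple

# Stand-in for skein.model.Queue holding only the fields used here.
# Assumption: all three numeric fields are percentages, as documented.
QueueInfo = namedtuple(
    "QueueInfo", ["name", "capacity", "max_capacity", "percent_used"]
)


def parent_share_used(queue):
    """Percentage of the parent's resources this queue currently uses."""
    # percent_used is relative to the queue's own capacity
    return queue.capacity * queue.percent_used / 100.0


def elastic_headroom(queue):
    """Percentage of the parent's resources the queue could still grow into."""
    return max(queue.max_capacity - parent_share_used(queue), 0.0)


q = QueueInfo(name="analytics", capacity=25.0, max_capacity=50.0,
              percent_used=120.0)
print(parent_share_used(q))  # 30.0
print(elastic_headroom(q))   # 20.0
```

Here the queue is guaranteed 25% of its parent, is running at 120% of that guarantee (30% of the parent), and may expand by another 20 percentage points before reaching `max_capacity`.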
Exceptions¶
-
exception
skein.
SkeinError
¶ Bases:
Exception
Base class for Skein specific exceptions
-
exception
skein.
ConnectionError
¶ Bases:
skein.exceptions.SkeinError
,ConnectionError
Failed to connect to the driver or application master
-
exception
skein.
DriverNotRunningError
¶ Bases:
skein.exceptions.ConnectionError
The driver process is not currently running
-
exception
skein.
ApplicationNotRunningError
¶ Bases:
skein.exceptions.ConnectionError
The application master is not currently running
-
exception
skein.
DriverError
¶ Bases:
skein.exceptions.SkeinError
Internal exceptions from the driver
-
exception
skein.
ApplicationError
¶ Bases:
skein.exceptions.SkeinError
Internal exceptions from the application master
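The base classes listed above determine which `except` clauses match. The sketch below reproduces part of the documented hierarchy with local stand-in classes so that it runs without skein installed. `SkeinConnectionError` stands in for `skein.ConnectionError` (renamed here to avoid shadowing the builtin), and `classify` is a hypothetical helper.

```python
# Stand-ins mirroring the documented bases; the real classes live in skein.
class SkeinError(Exception):
    """Base class for Skein specific exceptions."""


class SkeinConnectionError(SkeinError, ConnectionError):
    """Stand-in for skein.ConnectionError."""


class DriverNotRunningError(SkeinConnectionError):
    """Stand-in for skein.DriverNotRunningError."""


def classify(exc):
    """Return the most specific documented category matching ``exc``."""
    # Check subclasses before their bases so the narrowest match wins
    if isinstance(exc, DriverNotRunningError):
        return "driver not running"
    if isinstance(exc, SkeinConnectionError):
        return "connection failure"
    if isinstance(exc, SkeinError):
        return "other skein error"
    return "not a skein error"


try:
    raise DriverNotRunningError("no driver")
except ConnectionError as exc:
    # The builtin ConnectionError matches because it is one of the bases
    print(classify(exc))
```

Because the connection error derives from both `SkeinError` and the builtin `ConnectionError`, code that already handles the builtin keeps working, while `except SkeinError` catches every exception in this section.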
Tornado Utilities¶
-
skein.tornado.
init_kerberos
(keytab=None, service='HTTP', hostname=None)¶ Initialize Kerberos authentication settings.
Should be called once on process startup, since it sets global settings.
- Parameters
- keytabstr, optional
Path to the keytab file. If not set, will check the KRB5_KTNAME environment variable, and error otherwise.
- hostnamestr, optional
The hostname. If not provided, will be inferred from the system.
- servicestr, optional
The service name. The default of
"HTTP"
is usually sufficient.
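The keytab lookup order described above (explicit argument, then `KRB5_KTNAME`, then an error) can be sketched as follows. `resolve_keytab` is a hypothetical helper, and the use of `ValueError` is an assumption. The exception type that `init_kerberos` actually raises is not stated here.

```python
import os


def resolve_keytab(keytab=None, environ=None):
    """Pick a keytab path: explicit argument, then KRB5_KTNAME, else error.

    Hypothetical helper for illustration; ``environ`` defaults to
    ``os.environ`` and exists only to make the lookup easy to test.
    """
    environ = os.environ if environ is None else environ
    path = keytab or environ.get("KRB5_KTNAME")
    if not path:
        # Assumption: the real function may raise a different exception type
        raise ValueError("no keytab given and KRB5_KTNAME is not set")
    return path


# An explicit argument takes precedence over the environment variable
print(resolve_keytab("/path/to/my/service.keytab",
                     environ={"KRB5_KTNAME": "/etc/other.keytab"}))
```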
-
class
skein.tornado.
KerberosAuthMixin
¶ A
tornado.web.RequestHandler
mixin for authenticating with Kerberos.
Examples
A simple hello-world application:
from skein.tornado import init_kerberos, KerberosAuthMixin
from tornado import web, ioloop

# Create a handler with the KerberosAuthMixin as a base class
class HelloHandler(KerberosAuthMixin, web.RequestHandler):
    @web.authenticated
    def get(self):
        self.write("Hello %s" % self.current_user)

if __name__ == "__main__":
    # Initialize kerberos once per application
    init_kerberos(keytab="/path/to/my/service.keytab")

    # Serve web application
    app = web.Application([("/", HelloHandler)])
    app.listen(8888)
    ioloop.IOLoop.current().start()
-
class
skein.tornado.
SimpleAuthMixin
¶ A
tornado.web.RequestHandler
mixin for authenticating using Hadoop’s “simple” protocol.
In Hadoop, “simple” authentication uses a URL query parameter to specify the user, and isn’t secure at all. This mixin class exists for parity with Hadoop, but Kerberos authentication is advised instead.
Examples
A simple hello-world application:
from skein.tornado import SimpleAuthMixin
from tornado import web, ioloop

# Create a handler with the SimpleAuthMixin as a base class
class HelloHandler(SimpleAuthMixin, web.RequestHandler):
    @web.authenticated
    def get(self):
        self.write("Hello %s" % self.current_user)

if __name__ == "__main__":
    # Serve web application
    app = web.Application([("/", HelloHandler)])
    app.listen(8888)
    ioloop.IOLoop.current().start()
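To see why “simple” authentication offers no security, consider how a user name is read from a request URL. The sketch below assumes the query parameter is Hadoop's usual `user.name`. `simple_auth_user` is a hypothetical helper and not the mixin's actual implementation.

```python
from urllib.parse import urlsplit, parse_qs


def simple_auth_user(url):
    """Return the user named in the ``user.name`` query parameter, or None.

    Hypothetical helper for illustration; nothing verifies the claimed
    identity, which is exactly the weakness of "simple" authentication.
    """
    query = parse_qs(urlsplit(url).query)
    values = query.get("user.name")
    return values[0] if values else None


print(simple_auth_user("http://localhost:8888/?user.name=alice"))  # alice
print(simple_auth_user("http://localhost:8888/"))                  # None
```

Any client can claim to be any user by editing the URL, so this scheme is only appropriate on trusted networks.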