Debugging YARN applications can be tricky, as you often don’t have access to the remote nodes to debug failures. This page provides several tips for what to do when you encounter failures.
Accessing the Skein Web UI
For running applications, the Skein Web UI can provide useful information including:
What services are currently running in the application
Status of all current and past containers
Live links to logs for each service (these are especially useful)
Key-value pairs in the Key-Value Store
For more information, see the Web UI docs.
Accessing the Application Logs
When an application finishes, its logs are (usually) aggregated and made available. They can be accessed using the yarn logs CLI command.
$ yarn logs -applicationId <Application ID>
The logs contain the stderr for each service, as well as for the application master. This is a good first place to look when encountering an unexpected failure or bug.
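When the aggregated logs are long, it can help to pull out only the lines that look like failures. The following is a minimal sketch of that idea, not part of Skein: the helper names and the list of failure markers are assumptions chosen for illustration, and `fetch_logs` assumes the `yarn` CLI is on your `PATH`.

```python
import subprocess

# Substrings that commonly mark a failure in Java or Python output.
# This list is an illustrative assumption, not something Skein or YARN defines.
FAILURE_MARKERS = ("ERROR", "FATAL", "Exception", "Traceback")


def yarn_logs_command(application_id):
    """Build the argument list for `yarn logs -applicationId <Application ID>`."""
    return ["yarn", "logs", "-applicationId", application_id]


def fetch_logs(application_id):
    """Run `yarn logs` and return its combined output as text.

    Requires the `yarn` CLI and an application whose logs have
    already been aggregated.
    """
    result = subprocess.run(
        yarn_logs_command(application_id),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=True,
    )
    return result.stdout


def failure_lines(log_text, markers=FAILURE_MARKERS):
    """Return (line_number, line) pairs for lines containing any marker."""
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), start=1)
        if any(marker in line for marker in markers)
    ]
```

Calling `failure_lines(fetch_logs(app_id))` with your own application ID gives a short list of places to start reading. The matching is deliberately crude, so treat the result as a pointer into the full log rather than a replacement for it.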
Configuring Logging in the Application Master
Skein’s Application Master uses Log4j for logging. When debugging issues in Skein itself, it can be useful to increase the log level to get more information. This can be done in two ways:
Change the logging level with the log_level field in the specification (debug is a good option).
Override the default log configuration by specifying a custom log4j.properties file with the log_config field in the specification. This allows you to increase the logging level for component libraries as well. See the Log4j documentation for more information on configuration files.
master:
  # Change the log level to debug
  log_level: debug

  # OR provide a custom log configuration file
  log_config: path/to/my/log4j.properties
Configuring Logging in the Client Driver
skein.Client uses a Java driver process to communicate with services like YARN and HDFS. If you run into issues communicating with these services (submitting, querying, or killing applications), it may be useful to increase logging verbosity for the Skein driver process (debug is a good option). There are a few ways to do this:
Use the log_level keyword when creating a skein.Client.
Set the SKEIN_LOG_LEVEL environment variable (e.g. to debug).
Use the --log-level flag when starting a persistent driver using the CLI (e.g. skein driver start --log-level debug).
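The environment-variable and CLI options above can also be prepared from a Python script, for example when a wrapper launches the driver or another Skein-based program as a subprocess. This is a minimal sketch under that assumption: `driver_environment` and `driver_start_command` are hypothetical helper names, and only the `SKEIN_LOG_LEVEL` variable and the `skein driver start --log-level` command come from the options listed above.

```python
import os


def driver_environment(level="debug", base=None):
    """Return a copy of the environment with SKEIN_LOG_LEVEL set.

    `base` defaults to the current process environment; it is copied,
    never modified in place.
    """
    env = dict(os.environ if base is None else base)
    env["SKEIN_LOG_LEVEL"] = level
    return env


def driver_start_command(level="debug"):
    """Build the argument list for `skein driver start --log-level <level>`."""
    return ["skein", "driver", "start", "--log-level", level]
```

Either result can be handed to `subprocess.run`: pass `driver_start_command()` as the command to start a persistent driver, or pass `env=driver_environment()` when launching a script that creates its own skein.Client. Returning a copy keeps the calling process's own environment unchanged.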
Additionally, you may want to log to a file instead of to the terminal. There are also a few ways to do this:
Use the log keyword when creating a skein.Client.
Use the --log flag when starting a persistent driver using the CLI.
# Create a client, logging to `driver.log` with "debug" log level
import skein

client = skein.Client(log_level='debug', log='driver.log')