.. _creating-custom-step:
.. _custom_steps:

################
Step Development
################

A custom step should perform a single task and return a result. If an
exception is caught within the step, it must be re-raised, or another
exception that describes the issue must be raised. If a step does not raise an
exception, the Snippet Framework assumes the step completed successfully.
Swallowing an exception makes issues harder to track down, because you may end
up looking at the wrong step.

A custom step has no pre-defined function signature or format. Its flexibility
enables a step developer to request only the
:ref:`Standard Parameters <standard_parameters>` they need. Depending on the
step type, additional parameters may be required.

.. note::

    Standard Parameters differ from decorator parameters. Decorator parameters
    are used while registering the step to the Snippet Framework, while
    Standard Parameters are for the function signature.

**************
Considerations
**************

Before developing a |STEP_NAME|, ensure that there is not already a
|STEP_NAME| or a way to chain |STEP_NAME_PLURAL| together to accomplish the
task. This helps maintain a healthy set of |STEP_NAME_PLURAL| without overlap
and reduces the chances of outdated code being propagated to different Dynamic
Applications and/or SL1 stacks.

Follow security best practices when developing a |STEP_NAME|. For example,
avoid logging any secure information (such as credentials).

There are two cases in which a ScienceLogic Library should be created:

#. When a step is intended to be used by many Dynamic Applications, placing
   the code in a ScienceLogic Library is recommended. This eliminates the need
   to copy, paste, and potentially update it in multiple locations.

#. When a step leverages a third-party wheel/library that is not included in
   the execution environment, that library must be included in a ScienceLogic
   Library.

.. note::

    Custom ScienceLogic Library development is not covered by this document.
    Refer to `documentation `_ for creating custom ScienceLogic Libraries.

.. _req_id_gen:

*********************************
Avoiding De-Duplication Conflicts
*********************************

When creating a custom step, it is important to determine what makes a step
`unique`. By defining a step's `uniqueness`, the Snippet Framework can reduce
the time spent processing the same set of data. This identifier is referred to
as the ``request_id``, and anything with the same ``request_id`` is treated as
the same operation. The default ``request_id`` is ``_``, which may or may not
be sufficient to determine `uniqueness` for the step. If the step uses
configuration outside of the ``step_args``, then a custom ``request_id`` may be
required.

A custom ``request_id`` generator should return a string that identifies the
`uniqueness` of the step. For example, if you are running an SSH command, you
would expect to get the same results (within the same collection timeframe) if
the following are the same between collections:

* IP Address / Hostname
* User
* Command

.. code-block:: python

    def ssh_req_id(credential, step_args):
        # Assume step_args is a dict, and if not, step_args is the
        # entire command
        try:
            command = step_args["command"]
        except TypeError:
            command = step_args
        return "{}:{}:{}".format(
            credential.fields["cred_host"],
            credential.fields["cred_user"],
            command,
        )


    @register_requestor(get_req_id=ssh_req_id)
    def ssh_example(credential, step_args):
        pass

**********************
Writing to Device Logs
**********************

A custom step can write messages to the ``Device Logs``, which are visible in
SL1. This is done by raising the exception ``silo.apps.errors.DeviceError``
with the log message as the first parameter.

.. code-block:: python

    from silo.apps.errors import DeviceError

    raise DeviceError("Device Log message goes here")

.. note::

    A device log message is limited to 1024 characters.

.. _standard_arguments:
.. _standard_parameters:

*******************
Standard Parameters
*******************

When creating a custom step, it is important to understand the parameters that
are available to it. The following parameters can be used by all step types.

.. list-table:: Available Step Parameters
    :header-rows: 1

    * - Name
      - Type
      - Description
    * - ``step_args``
      - object
      - The ``step_args`` object passes arguments from the snippet argument
        into the step.
    * - ``collection``
      - Collection
      - Object that contains information related to the collection. It has
        the following attributes:

        * ``DynamicApp``: Configuration related to the Dynamic Application

          * freq (int): Dynamic Application Frequency
          * id (int): Dynamic Application ID
          * gmtime (int): Timestamp for the collection
          * guid (str): Dynamic Application GUID
          * name (str): Name of the Dynamic Application

        * ``group``: Group ID for the collection
        * ``obj_id``: Object ID for the collection
        * ``argument``: Snippet Argument for the collection
        * ``type``: Class Type for the collection
    * - ``credential``
      - CredentialObject
      - Object that contains the decrypted information for the aligned
        credential. It has the following attributes:

        * ``id``: Credential ID
        * ``name``: Credential Name
        * ``cred_type_name``: Name of the credential type. If the credential
          type (``credential.fields["cred_type"]``) is universal, the name
          will be the credential display name.

          * 1: SNMP
          * 2: Database
          * 3: SOAP/XML
          * 4: LDAP/AD
          * 5: Basic/Snippet
          * 6: SSH/Key
          * 7: PowerShell
          * 8: Universal Credential Display Name

        * ``fields``: All other credential information as a dictionary. Each
          field in the credential will have a corresponding key within this
          dictionary.
    * - ``debug``
      - callable
      - Function used for writing context-aware debug logs to the current
        collection and all de-duplicated collections.
    * - ``metadata``
      - object
      - Metadata related to the collection. Assigning a new value to this
        parameter does not update the metadata. To update the metadata, use
        ``set_metadata``. It is possible to modify the existing reference in
        place without ``set_metadata``, but doing so is prone to issues and is
        not recommended.
    * - ``request_id``
      - string
      - Request ID of the current step. The Request ID is the unique path to
        the step.
    * - ``result``
      - object
      - Finished object from the previous step.
    * - ``set_metadata``
      - callable
      - Sets the metadata to the value provided. An example of this callable
        would be ``set_metadata(new_metadata)``. If the step requested both
        ``metadata`` and ``set_metadata``, the ``metadata`` variable will not
        be updated with the information just set through ``set_metadata``.
        The next step can request ``metadata`` and receive the most
        up-to-date metadata.

**********
Step Types
**********

To register a step into the Snippet Framework, you must decorate the callable
with the decorator that applies to your step type. For example, to register a
`Processor` you would use the following:

.. code-block:: python

    @register_processor
    def count_items(result):
        pass

.. note::

    Python decorators are `wrappers` for functions that enable additional
    functionality. Refer to `PEP-318 `_ for more details.

The Snippet Framework has different types of steps that perform different
operations. This allows steps to be more focused. The types of steps are as
follows:

* :ref:`Requestor <custom_dev_requestor>` - Retrieves the required data from
  the datasource.
* :ref:`Processor <custom_dev_processor>` - Performs an action on the result.
  For example, a processor may parse, transform, or format the data. These are
  only a few examples of the actions a processor may perform.
* :ref:`Cacher <custom_dev_cacher>` - Attempts to fast-forward through already
  completed steps from cache. If no cache is found, a cache element will be
  saved at this step.
* :ref:`RequestMoreData <custom_dev_rmd>` - Enables looping within the Snippet
  Framework to collect additional information. An example of this would be
  pagination of a REST endpoint.

.. _custom_dev_requestor:

Requestor
=========

A Requestor defines how to retrieve information from a single source-type. A
Requestor has access to all the
:ref:`Standard Parameters <standard_parameters>`. Due to the uniqueness of a
Requestor, a ``request_id`` generator should be written. Refer to
:ref:`Avoiding De-Duplication Conflicts <req_id_gen>` for more information.
The returned value from the Requestor should be the result.

When registering a Requestor, the following information can be specified in
the decorator:

.. autofunction:: silo.low_code.register_requestor
    :noindex:

Examples
--------

Returning the Step Args
"""""""""""""""""""""""

The following example returns the step argument as the result.

.. code-block:: python

    @register_requestor
    def static_value(step_args):
        return step_args

Returning the Step Args and Updating Metadata
"""""""""""""""""""""""""""""""""""""""""""""

The following example returns the step argument as the result and updates the
metadata.

.. code-block:: python

    @register_requestor
    def static_value(metadata, set_metadata, step_args):
        # Start from an empty dict if no metadata has been set yet
        if not isinstance(metadata, dict):
            metadata = {}
        metadata["update"] = True
        set_metadata(metadata)
        return step_args

Return Step Args and Include Port Check
"""""""""""""""""""""""""""""""""""""""

The following example returns the step argument as the result. A validation
check also ensures that the credential uses the expected port.

.. code-block:: python

    def port_check(credential):
        try:
            if int(credential.fields["cred_port"]) != 443:
                raise Exception("This requestor only supports port 443")
        except (TypeError, ValueError):
            # int() raises TypeError for None and ValueError for bad strings
            raise Exception(
                "Invalid value specified for port. Expected int, but received {}".format(
                    type(credential.fields["cred_port"])
                )
            )


    @register_requestor(
        validate_request=port_check
    )
    def static_value(step_args):
        return step_args

Reading a File
""""""""""""""

The following example shows a custom Requestor that reads from a file.
The file is specified in ``step_args``.

.. code-block:: python

    @register_requestor(
        required_args=["file"],
    )  # Decorator that registers this step as a requestor
    def read_file(step_args):  # Defines the step name and parameters to use
        # Opens and reads the file indicated in the step_args
        with open(step_args["file"], "r") as file:
            return file.read()

HTTP Request
""""""""""""

The following example shows a basic HTTP Requestor. It uses several parts of
the decorator to ensure proper execution:

* ``validate_request``: Validates that the credential type is `Basic/Snippet`
* ``get_req_id``: Specifies request uniqueness
* ``required_args``: States that the URI is mandatory

The code appears as follows:

.. code-block:: python

    import hashlib

    import requests


    def cred_check(credential):
        # Credential type 5 is Basic/Snippet
        if credential.fields.get("cred_type") != 5:
            raise Exception("Credential Type is incorrect. Use Basic/Snippet.")


    def generate_request_id(credential, step_args, debug):
        # Combine the user, host, and every step argument into a single key
        key = credential.fields["cred_user"] + credential.fields["cred_host"]
        for arg in sorted(step_args):
            key = key + "|" + str(step_args[arg])
        hash_obj = hashlib.sha256()
        hash_obj.update(key.encode("utf-8"))
        return hash_obj.hexdigest()


    @register_requestor(
        required_args={"uri"}, validate_request=cred_check, get_req_id=generate_request_id
    )
    def https_simple(step_args, credential, debug):
        # Build the URL from the credential host and the requested URI
        url = "https://" + credential.fields["cred_host"] + "/" + step_args["uri"]
        debug("URL: " + url)
        auth = (credential.fields["cred_user"], credential.fields["cred_pwd"])
        # Note: verify=False disables TLS certificate verification
        response = requests.get(url, verify=False, auth=auth)
        # If the response was successful, no Exception will be raised
        response.raise_for_status()
        return response.content

.. _custom_dev_processor:

Processor
=========

A Processor should perform a single operation on a result. A Processor has
access to all the :ref:`Standard Parameters <standard_parameters>`.

When creating custom Processor steps, an important concept is the difference
between a ``Parser`` and a ``Selector``.
A ``Parser`` should convert a data structure into a consumable format for a
``Selector``. Separating these two concepts produces steps that are more
reusable than a single step that performs both actions.

When registering a Processor, the following information can be specified in
the decorator:

.. autofunction:: silo.low_code.register_processor
    :noindex:

Examples
--------

Count items in a List
"""""""""""""""""""""

The following example shows how to create a Processor that counts the number
of items in a list.

.. code-block:: python

    @register_processor
    def count_items(result):  # Only the result is passed into this function
        return len(result)  # Returns the number of items in the result

Wrapper Around a Custom Library
"""""""""""""""""""""""""""""""

The following example shows how to create a Processor that wraps a custom
library, `jc `_.

.. code-block:: python

    import jc


    @register_processor(
        name="jc",
        required_args=["parser_name"],
    )
    def jcparser(result, step_args):
        """Run jc against the result

        :param str result: Result from the previous step
        :param object step_args: Argument supplied to the step
        :rtype: object
        """
        # Removes "parser_name" from step_args when step_args is a dict, so
        # the remaining keys can be passed to jc as keyword arguments
        parser_mod_name = jcparser_get_parser_mod_name(step_args)
        jc_kwargs = step_args if isinstance(step_args, dict) else {}
        return jc.parse(parser_mod_name, result, **jc_kwargs)


    def jcparser_get_parser_mod_name(step_args):
        """Get the parser name from the configuration

        :param object step_args: Step arguments supplied to the step
        :rtype: str
        """
        if isinstance(step_args, str):
            parser_mod_name = step_args
        else:
            parser_mod_name = step_args.pop("parser_name")

        if parser_mod_name not in jcparser_get_supported_parsers():
            raise FrameworkError(
                "jc or the Snippet Framework does not support parser {}".format(parser_mod_name)
            )
        return parser_mod_name


    def jcparser_get_supported_parsers():
        """Determine all supported jc parsers

        The Snippet Framework cannot consume streaming parsers, so they are
        removed from the available list.

        :rtype: list
        """
        try:
            return [x for x in jc.parser_mod_list() if not x.endswith("_s")]
        except ImportError:
            return []

.. _custom_dev_cacher:

Cacher
=======

A Cacher can store the current result or, if the cache already exists,
perform a fast-forward operation before executing. A Cacher step cannot modify
the result. A Cacher can reuse the results from a previous collection /
Dynamic Application. A Cacher has access to all the
:ref:`Standard Parameters <standard_parameters>`.

A Cacher can optionally specify a ``read`` callable, which allows the
|FRAMEWORK_NAME| to fast-forward to the step after the ``Cacher``. This can be
specified in the registration decorator with the keyword parameter ``read``.

When registering a Cacher, the following information can be specified in the
decorator:

.. autofunction:: silo.low_code.register_cacher
    :noindex:

Example
-------

Writing based on a provided key
"""""""""""""""""""""""""""""""

This sample Cacher writes the current data to the specified key. If a key is
not specified, the ``request_id`` is used instead.

.. code:: python

    def get_key(step_args, request_id):
        # step_args may not be a dict; fall back to the request_id
        try:
            key = step_args.get("key", request_id)
        except AttributeError:
            key = request_id
        return key


    def cache_read(step_args, request_id, step_cache):
        return step_cache.read(get_key(step_args, request_id))


    @register_cacher(read=cache_read)
    def cache_write(result, step_args, request_id, step_cache):
        step_cache.write(get_key(step_args, request_id), result)

.. _custom_dev_rmd:

RequestMoreData
================

RequestMoreData enables the Snippet Framework to perform a loop to collect
additional data. It must be used in conjunction with a Requestor that supports
``rewind``.

A RequestMoreData step should either raise an exception that inherits from
``silo.low_code.RequestMoreData`` or return nothing. The raised exception
should carry information that the previous Requestor understands, so that the
Requestor knows how to perform the new request.
If an exception is not raised, the following step will receive an OrderedDict
containing all collected results. If the exception is raised, the current
index (from ``set_index`` or the current `request_id`) and the current result
will be inserted into the OrderedDict. If the same index is used twice in a
row, the |FRAMEWORK_NAME| will identify this scenario as a repeating loop, end
the RequestMoreData cycle, and continue to the next |STEP_NAME|.

.. note::

    An OrderedDict is accessed the same way as a normal dict, but the
    insertion order is preserved.

The RequestMoreData step can request the ``set_max_iterations`` callable,
which sets the maximum number of times the |FRAMEWORK_NAME| will loop when
gathering additional information.

When registering a RequestMoreData, the following information can be
specified in the decorator:

.. autofunction:: silo.low_code.register_rmd
    :noindex:

Example
-------

.. _custom_step_rmd:

Iterating over a static loop
""""""""""""""""""""""""""""

This sample Requestor returns the step argument as the result. It also
supports rewind functionality, where it will iterate until the number is 5 or
greater. Since the index is not specified, it is calculated from the request
id for the step.

Assuming that the initial number provided is 0, RequestMoreData will execute 5
times (0, 1, 2, 3, 4). When it processes ``rmd_step`` on the sixth iteration
(with a result of 5), it will not raise RequestMoreData. The final result will
be ``{'static_value:1': 1, 'static_value:2': 2, 'static_value:3': 3,
'static_value:4': 4, 'static_value:5_rmd_step_None': 5}``. The final result key
is different due to how the framework processes the place within the loop. If
you require consistent naming, it is best to use ``set_index`` to set the name.

.. code-block:: yaml

    low_code:
      version: 2
      steps:
        - static_value: 0
        - rmd_step

.. code-block:: python

    from silo.low_code import RequestMoreData


    @register_rmd
    def rmd_step(result):
        if result < 5:
            raise RequestMoreData(result=result)


    def range_increment(data_request):
        return data_request.result + 1


    @register_requestor(
        rewind=range_increment
    )
    def static_value(step_args):
        return int(step_args)

This sample Requestor returns the step argument as the result. It also
supports rewind functionality, where it will iterate until the number is 5 or
greater and increase by the current amount on each iteration. This example
also sets the index, which allows for easier lookups in the result if you add
identifying information.

Assuming that the initial number provided is 1, RequestMoreData will execute 3
times (1, 2, 4). When it processes ``rmd_step`` on the fourth iteration, it
will not raise RequestMoreData. The final value will be
``{'offset_1': 1, 'offset_2': 2, 'offset_4': 4, 'offset_8': 8}``.

.. code-block:: yaml

    low_code:
      version: 2
      steps:
        - static_value: 1
        - rmd_step

.. code-block:: python

    from silo.low_code import RequestMoreData


    @register_rmd
    def rmd_step(result, set_index):
        set_index("offset_" + str(result))
        if result < 5:
            raise RequestMoreData(result=result, amount=result)


    def range_increment(data_request):
        # Amount is the amount it increments each time
        return data_request.result + data_request.amount


    @register_requestor(
        rewind=range_increment
    )
    def static_value(step_args):
        return int(step_args)

******************
Using the new Step
******************

After the step has been created and tested, it must be registered into the
Snippet Framework. If the step is written within the Snippet, it will
automatically be registered. However, if the step is included in a
ScienceLogic Library, one of the following additional actions is required for
the step to be added to the Snippet Framework:

* Create a wheel that includes the correct entry point (preferred method)
* Update the default snippet to include the import

.. _sf_auto_reg:

Creating a wheel
================

A wheel is a standard Python package format used for distribution that
provides the required metadata for installation. When creating a wheel, adding
the entry_point `sf_step` enables the Snippet Framework to automatically
register your step.

In this example, we will assume that there is a package,
``my_custom_package``, that has defined ``__all__`` within
``my_custom_package/__init__.py``. Since all steps are imported when loading
the package, the entry_point will use the top-level package. Below is a
fragment of ``setup.cfg`` that enables the auto-import.

.. code-block:: cfg

    [options.entry_points]
    sf_step =
        my_custom_package = my_custom_package

*****************
Advanced Features
*****************

Utilizing Metadata Across Steps
===============================

Metadata enables a step to store relevant information that will be referenced
at a later time. This is useful when the information is not required for the
result, but a later step will consume it.

This example uses the Zillow API, which provides public housing data. It
combines housing prices per county from one API call with rental prices per
county from a different API call, and then calculates the ROI.

The first ``http`` step returns a large number of columns relating to the
historical sales price of homes. The ``reduce_data`` step keeps only the
datapoint for `2023-08-31`, which is defined in the snippet argument, along
with the other identifying fields, and the ``store_data`` step stores that
into the metadata. Then the rental API is called to get all the rental
information. Finally, the ``merge_data`` step loops through the rental data
and looks for a match in the stored sales data. When a match is found, a new
record is created that contains data from both API calls along with the
calculated ROI.

.. code-block:: yaml

    low_code:
      version: 2
      steps:
        - http:
            url: https://files.zillowstatic.com/research/public_csvs/zhvi/County_zhvi_uc_sfr_tier_0.33_0.67_sm_sa_month.csv?t=1695988357
        - jc: csv
        - reduce_data:
            filter_date: 2023-08-31
        - store_data: region_data
        - http:
            url: https://files.zillowstatic.com/research/public_csvs/zori/County_zori_uc_sfrcondomfr_sm_month.csv?t=1706113741
        - jc: csv
        - merge_data:
            data_key: region_data
            filter_date: 2023-08-31
        - jmespath:
            value: "[].{ROI: ROI, Location: join(', ', [County, State])}"

.. code-block:: python

    import datetime


    @register_processor(required_args=["filter_date"])
    def reduce_data(result, step_args):
        results = {}
        filter_date = step_args["filter_date"]
        # YAML may parse the date into a date object; convert it to a string
        if isinstance(filter_date, datetime.date):
            filter_date = filter_date.strftime("%Y-%m-%d")
        for region in result:
            region_data = {}
            # Keep only the whole-number part of the price
            price = region[filter_date].rsplit(".")[0]
            region_id = region["RegionID"]
            region_data["County"] = region["RegionName"]
            region_data["Price"] = int(price) if price else 0
            region_data["State"] = region["State"]
            results[region_id] = region_data
        return results


    @register_processor(required_args=["data_key", "filter_date"])
    def merge_data(result, step_args, metadata):
        results = []
        data_key = step_args["data_key"]
        filter_date = step_args["filter_date"]
        if isinstance(filter_date, datetime.date):
            filter_date = filter_date.strftime("%Y-%m-%d")
        # Retrieve data previously stored using store_data
        sales_data = metadata[data_key]
        for region in result:
            region_id = region["RegionID"]
            # Skip rental regions that have no matching sales data
            sale_data = sales_data.get(region_id)
            if not sale_data:
                continue
            sale_price = sale_data["Price"]
            rent_price = region[filter_date].rsplit(".")[0]
            rent_price = int(rent_price) if rent_price else 0
            if sale_price and rent_price:
                roi_data = sale_data.copy()
                roi_data["RegionID"] = region_id
                roi_data["Rent"] = rent_price
                # Calculate ROI Percentage
                roi_data["ROI"] = round(12 * 100 * rent_price / sale_price, 2)
                results.append(roi_data)
        # Sort from the highest ROI to the lowest
        return sorted(results, key=lambda x: x.get("ROI"), reverse=True)

.. _mutable-step:

Mutable Step
============

The |FRAMEWORK_NAME| reduces the amount of executing code during the
de-duplication process by executing unique calls only once. When a divergence
is detected, the |FRAMEWORK_NAME| determines how to push the current data into
the other divergent branches. To ensure that one branch does not interfere
with another branch, a memory-intensive operation is performed to clone the
objects for each branch. However, cloning is not required when a step does not
modify a protected attribute.

.. list-table:: Protected Attributes
    :widths: 30 15 55
    :header-rows: 1

    * - Name
      - Type
      - Description
    * - ``result``
      - object
      - Current result within the pipeline
    * - ``metadata``
      - dict
      - Metadata associated with the collection
    * - ``error_data``
      - object
      - Information for errors that occurred
    * - ``rmd_data``
      - OrderedDict
      - Data related to RequestMoreData iterations

Marking a custom |STEP_NAME| as ``mutable=False`` skips the memory-intensive
cloning, which offers a memory and speed boost to the |FRAMEWORK_NAME|. When a
|STEP_NAME| is not marked as ``mutable``, the |FRAMEWORK_NAME| copies the
reference of each protected attribute rather than cloning the attributes. To
safely mark a |STEP_NAME| as ``mutable=False``, the |STEP_NAME| must not update
the reference of a protected attribute.

``mutable`` should be defined as a boolean during the registration process. If
it is not specified, the |STEP_NAME| is assumed to be ``mutable``.

An example of a |STEP_NAME| that should be marked as ``mutable``:

.. code-block:: python
    :emphasize-lines: 5,10

    @register_processor(
        metadata={
            "author": "ScienceLogic",
            "title": "Pop Metadata Key",
            "mutable": True
        },
    )
    def pop_metadata_key(step_args, metadata):
        # This step is mutable as `.pop` modifies the metadata reference
        return metadata.pop(step_args)

Since the ``pop`` method of a dict removes the given key and alters the
*metadata* dict that the *reference* points to in **memory**, this |STEP_NAME|
*mutates* the data and should be marked as ``mutable``.
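
The effect of sharing a reference can be sketched in plain Python, independent
of the |FRAMEWORK_NAME|. This is a minimal illustration that assumes an
ordinary dict stands in for the protected ``metadata`` attribute; the function
names ``read_key`` and ``pop_key`` and the ``shared`` dict are hypothetical and
are not part of the framework.

.. code-block:: python

    def read_key(metadata, key):
        """Non-mutating: reads a value and leaves the dict untouched."""
        return metadata.get(key)


    def pop_key(metadata, key):
        """Mutating: removes the key from the dict that was passed in."""
        return metadata.pop(key)


    # Two branches that received the same reference (no cloning performed)
    shared = {"token": "abc", "page": 1}
    branch_a = shared
    branch_b = shared

    # Reading through one branch does not affect the other branch
    read_value = read_key(branch_a, "token")
    branch_b_after_read = dict(branch_b)

    # Popping through one branch also removes the key from the other branch
    popped_value = pop_key(branch_a, "token")
    branch_b_after_pop = dict(branch_b)

A step that behaves like ``read_key`` would be a candidate for
``mutable=False``, while a step that behaves like ``pop_key`` changes what
every branch holding the same reference sees.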