Having recently read the article
“Building a package for Solaris from source”, I realized that SMF has not been covered on Habr at all.
Let's correct that and see what SMF is and what advantages it gives administrators.
Introduction
Service Management Facility (SMF) is a service management system that appeared in Solaris 10. SMF allows you to more flexibly manage processes, assign dependencies to them, and restart if necessary. In addition, SMF allows you to delegate service management rights to regular (non-root) users.
To manage SMF, just three commands are enough:
- svcs - shows the status of services,
- svcadm - changes the state of services,
- svccfg - configures service parameters.
Let's see how to work with SMF by adding a service of our own.
I recently needed nginx on Solaris, so I had to build a package and integrate it into the system's service framework. I will use it as the example of how a service can be set up for management through SMF.
Adding a service
To integrate a service into SMF, you need to write a manifest for it: an XML file that describes dependencies, start and stop methods, and other parameters. For elementary services this is sufficient; more complex ones also need a method script (similar to /etc/init.d/service).
SMF Manifest Content
<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
<service_bundle type='manifest' name='nginx'>
<service
    name='network/nginx'
    type='service'
    version='1'>
    <create_default_instance enabled='false' />
    <single_instance />
    <dependency name='loopback'
        grouping='require_all'
        restart_on='error'
        type='service'>
        <service_fmri value='svc:/network/loopback:default' />
    </dependency>
    <dependency name='physical'
        grouping='optional_all'
        restart_on='error'
        type='service'>
        <service_fmri value='svc:/network/physical:default' />
    </dependency>
    <dependency name='multiuser-server'
        grouping='require_all'
        restart_on='error'
        type='service'>
        <service_fmri value='svc:/milestone/multi-user-server:default' />
    </dependency>
    <exec_method
        type='method'
        name='start'
        exec='/opt/nginx/svc/nginx start'
        timeout_seconds='60' />
    <exec_method
        type='method'
        name='stop'
        exec=':kill -QUIT'
        timeout_seconds='60' />
    <exec_method
        type='method'
        name='refresh'
        exec='/opt/nginx/svc/nginx refresh'
        timeout_seconds='60' />
    <property_group name='nginx' type='application'>
        <propval name='config' type='astring'
            value='/opt/nginx/etc/nginx.conf' />
        <propval name='pid' type='astring'
            value='/opt/nginx/var/run/nginx.pid' />
    </property_group>
    <property_group name='startd' type='framework'>
        <!-- core process dumps shouldn't restart
             session -->
        <propval name='ignore_error' type='astring'
            value='core,signal' />
    </property_group>
    <template>
        <common_name>
            <loctext xml:lang='C'>
                Nginx HTTP server
            </loctext>
        </common_name>
        <documentation>
            <manpage title='nginx' section='1M' />
            <doc_link name='nginx.org'
                uri='http://www.nginx.org/' />
        </documentation>
    </template>
</service>
</service_bundle>
Let's go through it in order:
- service_bundle - tells SMF how the file should be handled. Possible values of type are “archive”, “manifest” and “profile”; here we only consider “manifest”. The name attribute contains the name of the service;
- service - contains a set of service instances (a service can have several instances that differ in configuration), dependencies, management methods and configuration parameters. The name, version and type attributes hold the name, version and type of the service respectively. The type can be one of “service”, “restarter” or “milestone”.
There are a number of naming conventions for services (the name attribute). They are not required for the service to work, but they make the system easier to read. A standard category (system, application, network and so on) is prepended to the name with a slash (network/nginx). Nested categories are also allowed, for example to separate services by kind (application/database/mysql);
- create_default_instance and single_instance - say that the service has only one instance and that it is to be created disabled (enabled='false');
- dependency - describes the dependencies of the service. Dependencies are grouped by kind (the grouping attribute). All services in a “require_all” group must be online for the service to start. A “require_any” group needs at least one of the listed services, “exclude_all” requires that none of the listed services be running, and “optional_all” (as I understand it) only requires that the service start after all of its optional_all dependencies: it waits until each of them has come online, failed, or been disabled.
How the service reacts to a restart of a dependency is set by the restart_on attribute, which takes the following values: “error” - restart if the dependency was restarted because of an error (for example, a hardware fault), “restart” - restart if the dependency was restarted for any reason (including an error), “refresh” - restart if the dependency was restarted or refreshed. The value “none” means the service is never restarted, whatever happens to the dependency;
- exec_method - the methods used to manage the service. The start and stop methods are called when the service is enabled, disabled or restarted. The refresh method is called when the service is refreshed;
- property_group - defines a set of configuration parameters for the service. The “framework” type holds parameters for SMF itself, and the “application” type holds parameters of the service. In our case the startd/ignore_error parameter tells the restarter to ignore core dumps and fatal signals in individual processes and to restart the service only when all of its processes have exited. The parameters in the “application” group are used to configure the instance;
- template - an optional tag that contains meta information about the service.
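The restart_on values described in the list above can be written down as a small decision table. This is only a toy model of the rules as stated in this article, not how SMF is implemented; the function name should_restart and the event names are made up for illustration.

```shell
#!/bin/sh
# Toy model of the restart_on rules; not part of SMF.
# Usage: should_restart <restart_on value> <event>
#   event: error   - the dependency was restarted because of an error
#          restart - the dependency was restarted for another reason
#          refresh - the dependency was refreshed
should_restart() {
    case "$1:$2" in
        error:error)                                   echo yes ;;
        restart:error|restart:restart)                 echo yes ;;
        refresh:error|refresh:restart|refresh:refresh) echo yes ;;
        *)                                             echo no ;;   # includes restart_on='none'
    esac
}

should_restart error restart     # no
should_restart refresh refresh   # yes
should_restart none error        # no
```

Each value is a superset of the previous one, which is why the manifest uses the most conservative setting, 'error', for all three dependencies.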
As you can see, the manifest describes the service in some detail and also refers to the external script /opt/nginx/svc/nginx, which actually starts the service. Let's look at it:
#!/sbin/sh
#
. /lib/svc/share/smf_include.sh

# SMF_FMRI is the name of the target service. This allows multiple instances
# to use the same script.
if [ -z "$SMF_FMRI" ]; then
    echo "SMF framework variables are not initialized."
    # smf_include.sh has no plain SMF_EXIT_ERR; NOSMF is the matching code
    exit $SMF_EXIT_ERR_NOSMF
fi

# Print the value of property $1 (group/name) of the current instance
getproparg() {
    val=`svcprop -p "$1" "$SMF_FMRI"`
    [ -n "$val" ] && echo "$val"
}

NGINX_HOME=/opt/nginx
HTTPD="${NGINX_HOME}/sbin/nginx"
CONF_FILE=`getproparg nginx/config`
PIDFILE=`getproparg nginx/pid`

if [ -z "$CONF_FILE" ]; then
    echo "nginx/config property is not set"
    exit $SMF_EXIT_ERR_CONFIG
fi
if [ -z "$PIDFILE" ]; then
    echo "nginx/pid property is not set"
    exit $SMF_EXIT_ERR_CONFIG
fi
if [ ! -f "${CONF_FILE}" ]; then
    echo "nginx/config: could not find config file"
    exit $SMF_EXIT_ERR_CONFIG
fi

case "$1" in
start)
    # test the configuration first, then start the server
    $HTTPD -t -c "${CONF_FILE}" 2>&1
    if [ $? -ne 0 ]; then
        exit $SMF_EXIT_ERR_CONFIG
    fi
    $HTTPD -c "${CONF_FILE}" 2>&1
    ;;
refresh)
    # HUP makes nginx re-read its configuration
    if [ -f "$PIDFILE" ]; then
        /usr/bin/kill -HUP `/usr/bin/cat "$PIDFILE"`
    fi
    ;;
stop)
    if [ -f "$PIDFILE" ]; then
        /usr/bin/kill -KILL `/usr/bin/cat "$PIDFILE"`
    fi
    ;;
*)
    echo "Usage: $0 {start|stop|refresh}"
    exit 1
    ;;
esac
exit $SMF_EXIT_OK
This is a regular init.d-style script with a helper function. At the very beginning it includes a file with system variables that we rely on later. One of them is SMF_FMRI, which contains the full name of the service. SMF_FMRI is used to read configuration parameters defined in the manifest (the getproparg helper). The advantage of this approach will become obvious later, when we look at several instances of the same service.
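To see how getproparg picks up per-instance values without a Solaris machine at hand, svcprop can be replaced by a stub. The sketch below is an assumption-laden illustration: the svcprop function here is a stand-in that only knows the nginx/config property for the two instance names used in this article, whereas the real command reads the SMF repository.

```shell
#!/bin/sh
# Stand-in for svcprop(1); hypothetical, for experimenting outside Solaris.
# Called the same way the method script calls it: svcprop -p <group/property> <FMRI>
svcprop() {
    case "$3:$2" in
        svc:/network/nginx:default:nginx/config)    echo /opt/nginx/etc/nginx.conf ;;
        svc:/network/nginx:monitoring:nginx/config) echo /opt/nginx/etc/nginx-munin.conf ;;
    esac
}

# Same helper as in the method script
getproparg() {
    val=`svcprop -p "$1" "$SMF_FMRI"`
    [ -n "$val" ] && echo "$val"
}

# SMF sets SMF_FMRI before it runs a method; here we set it by hand
for SMF_FMRI in svc:/network/nginx:default svc:/network/nginx:monitoring; do
    echo "$SMF_FMRI -> `getproparg nginx/config`"
done
```

The script body never names an instance; only SMF_FMRI changes, which is what lets one method script serve every instance of the service.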
Now that we have a complete set of files, let's make them visible to the system:
# svccfg -v import nginx.xml
svccfg: Taking "initial" snapshot for svc:/network/nginx:default.
svccfg: Taking "last-import" snapshot for svc:/network/nginx:default.
svccfg: Refreshed svc:/network/nginx:default.
svccfg: Successful import.
We now have a network/nginx service with a single default instance, because the manifest asked for a default instance to be created. That instance should have been created disabled. Let's check:
# svcs nginx
STATE          STIME    FMRI
disabled       19:11:07 svc:/network/nginx:default
The management commands do not need the full FMRI (Fault Management Resource Identifier) of a service; the short name (for example, nginx) is enough. If several services share the same name in different categories, you have to give the full name. You also have to name the instance (nginx:default) if the service has more than one (the svcs command, given an incomplete FMRI, shows the status of every service that matches).
A disabled service will not be started when the OS boots, so it has to be enabled:
# svcadm -v enable nginx
svc:/network/nginx:default enabled.
Check that the service is running:
# svcs nginx
STATE          STIME    FMRI
maintenance    19:26:28 svc:/network/nginx:default
We are in for a disappointment: instead of online, the state is maintenance. The maintenance state means that something went wrong with the service. SMF puts a service into this state if the start method returns a value other than OK, or if an attempt to stop the service has failed three times in a row. Let's find out what happened in our case by looking at the extended status of the service:
# svcs -x nginx
svc:/network/nginx:default (Nginx HTTP server)
State: maintenance since Thu Mar 24 19:26:28 2011
Reason: Start method exited with $SMF_EXIT_ERR_CONFIG.
See: http://sun.com/msg/SMF-8000-KS
See: nginx(1M)
See: /var/svc/log/network-nginx:default.log
Impact: This service is not running.
As you can see, the extended status also includes the description from the template section of the manifest. It tells us that the start method failed and points to the log file with the details. The log shows the following:
[ Mar 24 19:26:28 Enabled. ]
[ Mar 24 19:26:28 Executing start method ("/opt/nginx/svc/nginx start") ]
nginx/config: could not find config file
[ Mar 24 19:26:28 Method "start" exited with status 96 ]
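The status 96 in the last log line is the numeric value of SMF_EXIT_ERR_CONFIG, the code our script returns when the configuration file is missing. A minimal sketch for translating such numbers back into names follows. Only 0 and 96 are confirmed by the output in this article; the other values are quoted from memory of /lib/svc/share/smf_include.sh and should be checked against that file. Both helper names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers, not SMF tools.

# Map a method exit status to its symbolic name.
# Values other than 0 and 96 are assumptions; verify them in smf_include.sh.
smf_exit_name() {
    case "$1" in
        0)   echo SMF_EXIT_OK ;;
        95)  echo SMF_EXIT_ERR_FATAL ;;
        96)  echo SMF_EXIT_ERR_CONFIG ;;
        99)  echo SMF_EXIT_ERR_NOSMF ;;
        100) echo SMF_EXIT_ERR_PERM ;;
        *)   echo "unknown ($1)" ;;
    esac
}

# Extract the number that follows "status " in a service log line.
status_from_log() {
    rest=${1##*status }            # drop everything up to and including "status "
    echo "${rest%%[!0-9]*}"        # keep only the leading digits
}

line='[ Mar 24 19:26:28 Method "start" exited with status 96 ]'
smf_exit_name "`status_from_log "$line"`"    # prints SMF_EXIT_ERR_CONFIG
```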
Our method script says that it cannot find the configuration file. Yes, I simply forgot to create it. Let's create the file and try to start the service again. Note that the maintenance state does not affect whether the service is enabled or disabled; it is a temporary stop. So all we need to do is “clear” the service, that is, tell the system that we have fixed it:
# svcadm clear nginx
# svcs -x nginx
svc:/network/nginx:default (Nginx HTTP server)
State: online since Thu Mar 24 19:40:03 2011
See: nginx(1M)
See: /var/svc/log/network-nginx:default.log
Impact: None.
# ps -fe | grep nginx
root 5864 1 0 19:40:04 ? 0:00 /opt/nginx/sbin/nginx -c /opt/nginx/etc/nginx.conf
nobody 5865 5864 0 19:40:04 ? 0:00 /opt/nginx/sbin/nginx -c /opt/nginx/etc/nginx.conf
The service has started and is doing its job. The log now contains the following:
[ Mar 24 19:40:03 Leaving maintenance because clear requested. ]
[ Mar 24 19:40:03 Enabled. ]
[ Mar 24 19:40:03 Executing start method ("/opt/nginx/svc/nginx start") ]
the configuration file /opt/nginx/etc/nginx.conf syntax is ok
configuration file /opt/nginx/etc/nginx.conf test is successful
[ Mar 24 19:40:03 Method "start" exited with status 0 ]
Now, if the service dies unexpectedly, SMF will restart it automatically (subject to the startd/ignore_error setting). Let's provoke this with kill -9 and look at the log:
[ Mar 24 19:42:25 Stopping because all processes in service exited. ]
[ Mar 24 19:42:25 Executing stop method (:kill) ]
[ Mar 24 19:42:25 Executing start method ("/opt/nginx/svc/nginx start") ]
the configuration file /opt/nginx/etc/nginx.conf syntax is ok
configuration file /opt/nginx/etc/nginx.conf test is successful
[ Mar 24 19:42:25 Method "start" exited with status 0 ]
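All the log excerpts in this section come from the file that svcs -x printed on its "See:" line. Its name can be derived from the FMRI, which is handy in scripts. The sketch below assumes the pattern visible in that output (strip the svc:/ prefix, turn slashes into dashes, append .log) holds for other services too; fmri_to_log is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: derive the log file name from a full instance FMRI,
# following the pattern seen in the svcs -x output above.
fmri_to_log() {
    name=${1#svc:/}                       # network/nginx:default
    name=`echo "$name" | tr '/' '-'`      # network-nginx:default
    echo "/var/svc/log/$name.log"
}

fmri_to_log svc:/network/nginx:default
# prints /var/svc/log/network-nginx:default.log
```

On a Solaris host this makes it easy to follow the log of the instance you are debugging, for example with tail -f "`fmri_to_log svc:/network/nginx:default`".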
Additional features
So, we have a service that is monitored at the OS level. But what if we need two or three similar services (for example, several postgresql servers, or nginx instances with different jobs)? Do we have to write a pile of manifests? Where is the benefit then?
This is where the ability to create several instances of one service comes in. To use it, we remove the create_default_instance and single_instance tags from the manifest, create our instance explicitly, and move the instance-specific parameters into it:
<instance name='default' enabled='false'>
    <property_group name='nginx' type='application'>
        <propval name='config' type='astring'
            value='/opt/nginx/etc/nginx.conf' />
        <propval name='pid' type='astring'
            value='/opt/nginx/var/run/nginx.pid' />
    </property_group>
    <property_group name='startd' type='framework'>
        <!-- core process dumps shouldn't restart
             session -->
        <propval name='ignore_error' type='astring'
            value='core,signal' />
    </property_group>
</instance>
The manifest has to be re-imported. SMF will work out what has changed:
# svccfg -v import nginx.xml
svccfg: Taking "previous" snapshot for svc:/network/nginx:default.
svccfg: Upgrading properties of svc:/network/nginx according to instance "default".
svccfg: svc:/network/nginx: Deleting property group "nginx".
svccfg: svc:/network/nginx: Deleting property group "general".
svccfg: svc:/network/nginx: Deleting property group "startd".
svccfg: Taking "last-import" snapshot for svc:/network/nginx:default.
svccfg: Refreshed svc:/network/nginx:default.
svccfg: Successful import.
As a result we have the same service description, but now with the ability to configure several instances. An additional instance can be declared directly in the manifest:
<instance name='monitoring' enabled='false'>
    <property_group name='nginx' type='application'>
        <propval name='config' type='astring'
            value='/opt/nginx/etc/nginx-munin.conf' />
        <propval name='pid' type='astring'
            value='/opt/nginx/var/run/nginx-munin.pid' />
    </property_group>
    <property_group name='startd' type='framework'>
        <!-- core process dumps shouldn't restart
             session -->
        <propval name='ignore_error' type='astring'
            value='core,signal' />
    </property_group>
</instance>
And import it:
# svccfg -v import nginx.xml
svccfg: Taking "previous" snapshot for svc:/network/nginx:default.
svccfg: Taking "previous" snapshot for new service svc:/network/nginx:monitoring.
svccfg: Upgrading properties of svc:/network/nginx according to instance "default".
svccfg: Taking "initial" snapshot for svc:/network/nginx:monitoring.
svccfg: Taking "last-import" snapshot for svc:/network/nginx:monitoring.
svccfg: Taking "last-import" snapshot for svc:/network/nginx:default.
svccfg: Refreshed svc:/network/nginx:monitoring.
svccfg: Refreshed svc:/network/nginx:default.
svccfg: Successful import.
Now we have two service instances (note that the default instance is still running):
# svcs nginx
STATE          STIME    FMRI
disabled       20:16:31 svc:/network/nginx:monitoring
online         20:16:31 svc:/network/nginx:default
From now on the instance has to be named explicitly when starting the service, otherwise the system complains:
# svcadm enable nginx
svcadm: Pattern 'nginx' matches multiple instances:
    svc:/network/nginx:monitoring
    svc:/network/nginx:default
# svcadm -v enable nginx:monitoring
svc:/network/nginx:monitoring enabled.
You can also add an instance without editing the manifest (the manifest must already be set up for several instances and contain the default one) by using the svccfg command, which changes service parameters. Roughly speaking, once a manifest has been imported the source file no longer matters, because its contents now live in the SMF database. To get a manifest with the current settings of a service, use the svccfg export command. Adding instances on the fly lets you automate the process:
# svccfg -s nginx add phpfpm
# svccfg -s nginx:phpfpm addpg nginx application
# svccfg -s nginx:phpfpm setprop nginx/config = astring: /opt/nginx/etc/fpm.conf
# svccfg -s nginx:phpfpm setprop nginx/pid = astring: /opt/nginx/run/fpm.pid
# svcadm disable nginx:phpfpm   # this adds the system properties automatically
# svcs nginx
STATE          STIME    FMRI
disabled       20:37:30 svc:/network/nginx:phpfpm
online         20:16:31 svc:/network/nginx:default
online         20:21:09 svc:/network/nginx:monitoring
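Since the same five commands are needed for every new instance, they are easy to wrap in a function. The sketch below is hypothetical: add_nginx_instance and the DRY_RUN switch are illustrative names, and by default the function only prints the commands so it can be tried on any machine. Set DRY_RUN=0 on a Solaris host to actually run them.

```shell
#!/bin/sh
# Hypothetical wrapper around the svccfg/svcadm commands shown above.
DRY_RUN=${DRY_RUN:-1}     # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: add_nginx_instance <instance name> <config file> <pid file>
add_nginx_instance() {
    run svccfg -s nginx add "$1"
    run svccfg -s "nginx:$1" addpg nginx application
    run svccfg -s "nginx:$1" setprop nginx/config = astring: "$2"
    run svccfg -s "nginx:$1" setprop nginx/pid = astring: "$3"
    # as in the transcript above, this also creates the system properties
    run svcadm disable "nginx:$1"
}

add_nginx_instance phpfpm /opt/nginx/etc/fpm.conf /opt/nginx/run/fpm.pid
```

Keeping the dry-run mode as the default is a deliberate choice: the printed command list can be reviewed, or pasted into a root shell, before anything touches the SMF database.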
If the service has to run as a user other than root, that is possible too. Just add
<method_context><method_credential user='munin' group='munin' /></method_context>
to the description of the instance, or of a single method (if only that method has to run as another user):
# ps -fe | grep nginx
munin 6254 1 0 21:10:52 ? 0:00 /opt/nginx/sbin/nginx -c /opt/nginx/etc/nginx-munin.conf
munin 6255 6254 0 21:10:52 ? 0:00 /opt/nginx/sbin/nginx -c /opt/nginx/etc/nginx-munin.conf
root 5884 1 0 19:42:25 ? 0:00 /opt/nginx/sbin/nginx -c /opt/nginx/etc/nginx.conf
nobody 6015 5884 0 21:05:04 ? 0:00 /opt/nginx/sbin/nginx -c /opt/nginx/etc/nginx.conf
In a similar way you can start services in separate projects (projects are a means of limiting resources). And if you want someone other than root to manage services, SMF integrates with Solaris RBAC!
A user can be given global rights to change any methods, dependencies and parameters in the “application” / “framework” groups, as well as permissions on specific property groups. Each group can carry the special properties modify_authorization, value_authorization and action_authorization, whose value is the “authorization” required for the corresponding operation.
- action_authorization - I have only seen it used in the “general” group of an instance. The property is stored in that group, and the authorization allows actions on the service that do not write anything to its configuration, for example refresh, restart, and setting or clearing maintenance.
- value_authorization - allows changing the values of properties in the group, but not adding or deleting properties. If you add this authorization to the general group, the user can change the enabled property and therefore enable and disable the service. If you add it to a group such as nginx, the user can change the path to the configuration file.
- modify_authorization - allows changing, adding and deleting properties in the group.
As an example, I will show how to use the action and value authorizations to control a service.
First we give the user some “authorization”. I will use a short name, but by convention authorizations should be named meaningfully (for example, solaris.smf.manage.nginx/monitoring). (There are also a number of predefined authorizations in /etc/security/auth_attr, but RBAC is a subject for a separate large article):
# echo "solaris.munin:::Munin authorization::" >> /etc/security/auth_attr
# usermod -A solaris.munin munin
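Appending to auth_attr with echo adds a duplicate line every time the command is repeated. A minimal sketch of a safer variant is shown below, assuming the entry layout used in the command above (name, two empty reserved fields, short description, two more empty fields). The helper add_auth is hypothetical, and the demonstration writes to a scratch file rather than to /etc/security/auth_attr.

```shell
#!/bin/sh
# Hypothetical helper: add an authorization entry only if it is not there yet.
# Usage: add_auth <authorization name> <short description> [file]
add_auth() {
    file=${3:-/etc/security/auth_attr}
    # compare the first colon-separated field exactly, so dots are not wildcards
    if awk -F: -v n="$1" '$1 == n { found = 1 } END { exit !found }' "$file" 2>/dev/null; then
        echo "$1 is already listed in $file"
    else
        echo "$1:::$2::" >> "$file"
    fi
}

# Demonstration on a scratch file
demo=${TMPDIR:-/tmp}/auth_attr.demo.$$
: > "$demo"
add_auth solaris.munin "Munin authorization" "$demo"
add_auth solaris.munin "Munin authorization" "$demo"   # second call adds nothing
cat "$demo"
rm -f "$demo"
```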
As long as the service itself has not been configured, the user still cannot do anything:
munin@sol2 $ /usr/sbin/svcadm restart nginx:monitoring
svcadm: svc:/network/nginx:monitoring: Permission denied.
munin@sol2 $ /usr/sbin/svcadm disable nginx:monitoring
svcadm: svc:/network/nginx:monitoring: Permission denied.
Let holders of the solaris.munin “authorization” restart the service:
# svccfg -s nginx:monitoring setprop general/action_authorization = astring: solaris.munin
# svcadm refresh nginx:monitoring
(After any change to the service configuration, refresh the service so that SMF re-reads the configuration for the running service)
Checking:
munin@sol2 $ /usr/sbin/svcadm -v restart nginx:monitoring
Action restart set for svc:/network/nginx:monitoring.
munin@sol2 $ /usr/sbin/svcadm -v disable nginx:monitoring
svcadm: svc:/network/nginx:monitoring: Could not modify "general" property group (permission denied).
As you can see, the user can restart the service but cannot change a property in the SMF database. This approach lets users restart “their” services after changing a configuration file. If a user also needs to be able to stop the service permanently, we allow him to modify values in the “general” property group:
# svccfg -s nginx:monitoring setprop general/value_authorization = astring: solaris.munin
# svcadm refresh nginx:monitoring
Checking:
munin@sol2 $ /usr/sbin/svcadm -v disable nginx:monitoring
svc:/network/nginx:monitoring disabled.
munin@sol2 $ /usr/sbin/svcadm -v enable nginx:monitoring
svc:/network/nginx:monitoring enabled.
Using the above techniques, you can create a convenient system for users, while flexibly delimiting their rights.
I hope that this article will help someone to better understand what SMF is, and even encourage them to learn more about SMF, RBAC and Solaris.