
SMF - Service Management on Solaris

Having recently read the article “Building a package for Solaris from source”, I realized that SMF has not been covered at all on Habr.
Let's fix that and look at what SMF is and what advantages it gives administrators.

Introduction


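Service Management Facility (SMF) is a service management system that first appeared in Solaris 10. SMF lets you manage processes more flexibly, declare dependencies between them, and restart them when necessary. In addition, SMF lets you delegate service management rights to regular (non-root) users.
Just three commands are enough to manage SMF:
 svcs: shows the state of services
 svcadm: changes the state of services (enable, disable, restart, refresh, clear)
 svccfg: manages service configuration (importing manifests, editing properties)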

Let's see how to manage SMF using the example of adding our own service.
I recently needed nginx on Solaris, so I had to build a package and integrate it into the common service system. Using it as an example, let's see how a service can be set up for management through SMF.

Adding a service


To integrate a service into SMF, you need to write a manifest: an XML file describing dependencies, start and stop methods, and other parameters. For simple services this is sufficient; more complex ones also need a method script (similar to /etc/init.d/service).

SMF Manifest Content

 <?xml version="1.0"?>
 <!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">

 <service_bundle type='manifest' name='nginx'>

 <service
         name='network/nginx'
         type='service'
         version='1'>

         <create_default_instance enabled='false' />
         <single_instance />

         <dependency name='loopback'
             grouping='require_all'
             restart_on='error'
             type='service'>
                 <service_fmri value='svc:/network/loopback:default' />
         </dependency>

         <dependency name='physical'
             grouping='optional_all'
             restart_on='error'
             type='service'>
                 <service_fmri value='svc:/network/physical:default' />
         </dependency>

         <dependency name='multiuser-server'
             grouping='require_all'
             restart_on='error'
             type='service'>
                 <service_fmri value='svc:/milestone/multi-user-server:default' />
         </dependency>

         <exec_method
             type='method'
             name='start'
             exec='/opt/nginx/svc/nginx start'
             timeout_seconds='60' />

         <exec_method
             type='method'
             name='stop'
             exec=':kill -QUIT'
             timeout_seconds='60' />

         <exec_method
             type='method'
             name='refresh'
             exec='/opt/nginx/svc/nginx refresh'
             timeout_seconds='60' />

         <property_group name='nginx' type='application'>
                 <propval name='config' type='astring'
                     value='/opt/nginx/etc/nginx.conf' />
                 <propval name='pid' type='astring'
                     value='/opt/nginx/var/run/nginx.pid' />
         </property_group>

         <property_group name='startd' type='framework'>
                 <!-- core process dumps shouldn't restart
                      session -->
                 <propval name='ignore_error' type='astring'
                          value='core,signal' />
         </property_group>

         <template>
                 <common_name>
                         <loctext xml:lang='C'>
                                 Nginx HTTP server
                         </loctext>
                 </common_name>
                 <documentation>
                         <manpage title='nginx' section='1M' />
                         <doc_link name='nginx.org'
                                 uri='http://www.nginx.org/' />
                 </documentation>
         </template>
 </service>
 </service_bundle>

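Let's go through it in order:
 service: the service is named network/nginx, so its full identifier is svc:/network/nginx.
 create_default_instance: creates an instance named default, initially disabled.
 single_instance: declares that the service has only one instance.
 dependency: the services nginx depends on. With grouping require_all, every listed service must be online; optional_all also accepts services that are disabled or absent. restart_on='error' restarts nginx if the dependency is restarted because of an error.
 exec_method: how to start, stop and refresh the service, each with a timeout in seconds. The special method :kill sends a signal to all processes of the service.
 property_group nginx: our own parameters (paths to the configuration and PID files), which the method script reads.
 property_group startd: with ignore_error set to core,signal, a core dump or a signal received by one of the processes is not treated as a service failure.
 template: a human-readable name and links to documentation.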

As you can see, the manifest describes the service in some detail and also refers to the external script /opt/nginx/svc/nginx, which actually starts the service. Let's look at it now:
 #!/sbin/sh
 #

 . /lib/svc/share/smf_include.sh

 # SMF_FMRI is the name of the target service. This allows multiple instances
 # to use the same script.

 if [ -z "$SMF_FMRI" ]; then
         echo "SMF framework variables are not initialized."
         exit $SMF_EXIT_ERR
 fi

 # Print the value of property $1 of the current instance, if it is set.
 getproparg() {
         val=`svcprop -p $1 $SMF_FMRI`
         [ -n "$val" ] && echo $val
 }

 NGINX_HOME=/opt/nginx
 HTTPD="${NGINX_HOME}/sbin/nginx"
 CONF_FILE=`getproparg nginx/config`
 PIDFILE=`getproparg nginx/pid`

 if [ -z "$CONF_FILE" ]; then
         echo "nginx/config property is not set"
         exit $SMF_EXIT_ERR_CONFIG
 fi

 if [ -z "$PIDFILE" ]; then
         echo "nginx/pid property is not set"
         exit $SMF_EXIT_ERR_CONFIG
 fi

 if [ ! -f "${CONF_FILE}" ]; then
         echo "nginx/config: could not find config file"
         exit $SMF_EXIT_ERR_CONFIG
 fi

 case "$1" in
 start)
         # Test the configuration first, then start the server.
         $HTTPD -t -c ${CONF_FILE} 2>&1
         if [ $? -ne 0 ]; then
                 exit $SMF_EXIT_ERR_CONFIG
         fi
         $HTTPD -c ${CONF_FILE} 2>&1
         ;;
 refresh)
         if [ -f "$PIDFILE" ]; then
                 /usr/bin/kill -HUP `/usr/bin/cat $PIDFILE`
         fi
         ;;
 stop)
         if [ -f "$PIDFILE" ]; then
                 /usr/bin/kill -KILL `/usr/bin/cat $PIDFILE`
         fi
         ;;
 *)
         echo "Usage: $0 {start|stop|refresh}"
         exit 1
         ;;
 esac

 exit $SMF_EXIT_OK

This is a regular init.d-style script with a few helper functions. At the very beginning it sources a file with system variables that will help us later. One of them is SMF_FMRI, which contains the full name of the service instance. SMF_FMRI is used to read configuration parameters from the service's properties (the getproparg helper function). The advantage of this approach will become obvious later, when we look at several instances of the same service.
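
Because one method script serves every instance, it can be handy to split the FMRI into its service and instance parts, for example to label log messages per instance. A minimal sketch, assuming an FMRI of the form svc:/category/name:instance; the helper names fmri_instance and fmri_service are hypothetical and are not part of smf_include.sh:

```shell
# Hypothetical helpers (not provided by smf_include.sh).
# Both assume an FMRI of the form svc:/category/name:instance.

# Print the instance part, e.g. "monitoring".
fmri_instance() {
    printf '%s\n' "$1" | sed 's|^.*:||'
}

# Print the service part without scheme and instance, e.g. "network/nginx".
fmri_service() {
    printf '%s\n' "$1" | sed -e 's|^svc:/*||' -e 's|:[^:]*$||'
}

fmri_instance "svc:/network/nginx:monitoring"
fmri_service  "svc:/network/nginx:monitoring"
```

Inside a method script the argument would normally be "$SMF_FMRI". An FMRI without an instance part is not handled by this sketch.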
Now that we have a complete set of files, we will make them visible to the system:
 # svccfg -v import nginx.xml
 svccfg: Taking "initial" snapshot for svc:/network/nginx:default.
 svccfg: Taking "last-import" snapshot for svc:/network/nginx:default.
 svccfg: Refreshed svc:/network/nginx:default.
 svccfg: Successful import.

We now have a network/nginx service with one default instance, because the manifest asked for one with create_default_instance. This instance should have been created in the disabled state. Let's check:
 # svcs nginx
 STATE          STIME    FMRI
 disabled       19:11:07 svc:/network/nginx:default

Management commands do not need the full FMRI (Fault Management Resource Identifier) of the service; just the name (for example, nginx) is enough. If several services share a name but belong to different categories, you have to give the full name. You also need to specify the instance (nginx:default) if the service has more than one; the svcs command with an abbreviated FMRI shows the status of every service that matches.

A disabled service will not start when the OS boots, so it has to be enabled:
 # svcadm -v enable nginx
 svc:/network/nginx:default enabled.

Check that the service is running:
 # svcs nginx
 STATE          STIME    FMRI
 maintenance    19:26:28 svc:/network/nginx:default

We are in for a disappointment: instead of online, the state is maintenance. The maintenance state means that the service hit an error. SMF puts a service into this state if the start method returns a value other than OK or if an attempt to stop the service fails three times in a row. Let's find out the reason in our case by looking at the extended status of the service:
 # svcs -x nginx
 svc:/network/nginx:default (Nginx HTTP server)
  State: maintenance since Thu Mar 24 19:26:28 2011
 Reason: Start method exited with $SMF_EXIT_ERR_CONFIG.
    See: http://sun.com/msg/SMF-8000-KS
    See: nginx(1M)
    See: /var/svc/log/network-nginx:default.log
 Impact: This service is not running.

As you can see, the extended status also includes the description from the template section of the manifest. It tells us that the start method ended with an error and points to the log, where we can find more detail. The log shows the following:
 [ Mar 24 19:26:28 Enabled. ]
 [ Mar 24 19:26:28 Executing start method ("/opt/nginx/svc/nginx start") ]
 nginx/config: could not find config file
 [ Mar 24 19:26:28 Method "start" exited with status 96 ]
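
The numeric status 96 in the log corresponds to the SMF_EXIT_ERR_CONFIG constant that the method script returned. As a small sketch, the mapping can be written out as a lookup; the function name smf_exit_name is hypothetical, and the values are the ones defined in /lib/svc/share/smf_include.sh as far as I know, so check them against the file on your system:

```shell
# Hypothetical helper: translate a method exit status into the symbolic
# name from /lib/svc/share/smf_include.sh (values assumed from Solaris 10).
smf_exit_name() {
    case "$1" in
        0)   echo "SMF_EXIT_OK" ;;
        95)  echo "SMF_EXIT_ERR_FATAL" ;;
        96)  echo "SMF_EXIT_ERR_CONFIG" ;;
        99)  echo "SMF_EXIT_ERR_NOSMF" ;;
        100) echo "SMF_EXIT_ERR_PERM" ;;
        *)   echo "other ($1)" ;;
    esac
}

smf_exit_name 96
```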

Our method script says that it cannot find the configuration file. Yes, I simply forgot to create it. Let's create the file and try to start the service again. Note that the maintenance state does not affect whether the service is enabled or disabled; it is a temporary stop. So we only need to “clear” the service, that is, tell the system that we have fixed it:
 # svcadm clear nginx
 # svcs -x nginx
 svc:/network/nginx:default (Nginx HTTP server)
  State: online since Thu Mar 24 19:40:03 2011
    See: nginx(1M)
    See: /var/svc/log/network-nginx:default.log
 Impact: None.

 # ps -fe | grep nginx
     root  5864     1   0 19:40:04 ?           0:00 /opt/nginx/sbin/nginx -c /opt/nginx/etc/nginx.conf
   nobody  5865  5864   0 19:40:04 ?           0:00 /opt/nginx/sbin/nginx -c /opt/nginx/etc/nginx.conf

The service has started and is doing its job. The log now contains the following:
 [ Mar 24 19:40:03 Leaving maintenance because clear requested. ]
 [ Mar 24 19:40:03 Enabled. ]
 [ Mar 24 19:40:03 Executing start method ("/opt/nginx/svc/nginx start") ]
 the configuration file /opt/nginx/etc/nginx.conf syntax is ok
 configuration file /opt/nginx/etc/nginx.conf test is successful
 [ Mar 24 19:40:03 Method "start" exited with status 0 ]
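
When the fix-and-clear cycle is scripted, it helps to wait until the instance has actually settled instead of checking the state once. A minimal sketch, assuming a POSIX shell (ksh or bash rather than the legacy /sbin/sh) and that svcs -H -o state prints only the state column; the function name wait_online is hypothetical:

```shell
# Hypothetical helper: poll an instance until it is online or in maintenance.
# usage: wait_online FMRI [max_tries]   (one try per second, default 30)
wait_online() {
    fmri=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        # -H drops the header line, -o state limits output to the state column
        state=$(svcs -H -o state "$fmri" 2>/dev/null)
        case "$state" in
            online)
                return 0 ;;
            maintenance)
                echo "$fmri is in maintenance" >&2
                return 1 ;;
        esac
        tries=$((tries - 1))
        sleep 1
    done
    echo "$fmri did not come online in time" >&2
    return 1
}
```

A typical call would be: svcadm clear nginx && wait_online svc:/network/nginx:default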

Now, if the service dies unexpectedly, SMF will restart it automatically (subject to the startd/ignore_error setting). Let's provoke this with kill -9 and look at the log:
 [ Mar 24 19:42:25 Stopping because all processes in service exited. ]
 [ Mar 24 19:42:25 Executing stop method (:kill) ]
 [ Mar 24 19:42:25 Executing start method ("/opt/nginx/svc/nginx start") ]
 the configuration file /opt/nginx/etc/nginx.conf syntax is ok
 configuration file /opt/nginx/etc/nginx.conf test is successful
 [ Mar 24 19:42:25 Method "start" exited with status 0 ]


Additional features

So, we have a service that is monitored at the OS level. But what if we need two or three identical services (for example, several PostgreSQL servers, or nginx with different tasks)? Do we have to write a pile of manifests? What would be the benefit then?
This is where the ability to create several instances of the same service comes in. To do this, we remove the create_default_instance and single_instance tags from the manifest, create our instance explicitly and move the instance-specific parameters into it:
 <instance name='default' enabled='false'>
    <property_group name='nginx' type='application'>
       <propval name='config' type='astring'
                value='/opt/nginx/etc/nginx.conf' />
       <propval name='pid' type='astring'
                value='/opt/nginx/var/run/nginx.pid' />
    </property_group>

    <property_group name='startd' type='framework'>
       <!-- core process dumps shouldn't restart
            session -->
       <propval name='ignore_error' type='astring'
                value='core,signal' />
    </property_group>
 </instance>

The manifest has to be imported again. SMF will work out what has changed:
 # svccfg -v import nginx.xml
 svccfg: Taking "previous" snapshot for svc:/network/nginx:default.
 svccfg: Upgrading properties of svc:/network/nginx according to instance "default".
 svccfg: svc:/network/nginx: Deleting property group "nginx".
 svccfg: svc:/network/nginx: Deleting property group "general".
 svccfg: svc:/network/nginx: Deleting property group "startd".
 svccfg: Taking "last-import" snapshot for svc:/network/nginx:default.
 svccfg: Refreshed svc:/network/nginx:default.
 svccfg: Successful import.

As a result, we have the same service description, but now with the ability to configure several instances. An additional instance can be declared directly in the manifest:
 <instance name='monitoring' enabled='false'>
    <property_group name='nginx' type='application'>
       <propval name='config' type='astring'
                value='/opt/nginx/etc/nginx-munin.conf' />
       <propval name='pid' type='astring'
                value='/opt/nginx/var/run/nginx-munin.pid' />
    </property_group>

    <property_group name='startd' type='framework'>
       <!-- core process dumps shouldn't restart
            session -->
       <propval name='ignore_error' type='astring'
                value='core,signal' />
    </property_group>
 </instance>

And import:
 # svccfg -v import nginx.xml
 svccfg: Taking "previous" snapshot for svc:/network/nginx:default.
 svccfg: Taking "previous" snapshot for new service svc:/network/nginx:monitoring.
 svccfg: Upgrading properties of svc:/network/nginx according to instance "default".
 svccfg: Taking "initial" snapshot for svc:/network/nginx:monitoring.
 svccfg: Taking "last-import" snapshot for svc:/network/nginx:monitoring.
 svccfg: Taking "last-import" snapshot for svc:/network/nginx:default.
 svccfg: Refreshed svc:/network/nginx:monitoring.
 svccfg: Refreshed svc:/network/nginx:default.
 svccfg: Successful import.

Now we have two service instances (note that the default instance stays online):
 # svcs nginx
 STATE          STIME    FMRI
 disabled       20:16:31 svc:/network/nginx:monitoring
 online         20:16:31 svc:/network/nginx:default

From now on, the instance has to be named explicitly when starting the service; otherwise the system complains:
 # svcadm enable nginx
 svcadm: Pattern 'nginx' matches multiple instances:
         svc:/network/nginx:monitoring
         svc:/network/nginx:default

 # svcadm -v enable nginx:monitoring
 svc:/network/nginx:monitoring enabled.
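
If the goal really is to enable every instance that matches an abbreviated name, one option is to expand the pattern with svcs and pass each full FMRI to svcadm. A minimal sketch, assuming a POSIX shell and that svcs -H -o fmri prints one FMRI per line; the function name enable_all is hypothetical:

```shell
# Hypothetical helper: enable every instance matching a pattern.
# usage: enable_all PATTERN   (for example: enable_all nginx)
enable_all() {
    # svcs expands the pattern; svcadm then receives unambiguous FMRIs
    svcs -H -o fmri "$1" | while read -r fmri; do
        svcadm enable "$fmri"
    done
}
```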

You can also add an instance without editing the manifest (the service must already be set up for several instances and contain default) using svccfg, the command for changing service parameters. Roughly speaking, once a manifest has been imported, the source file no longer plays any role, because its contents now live in the SMF database. To get a manifest with the current service settings, use the svccfg export command. Adding instances on the fly makes it possible to automate the process:
 # svccfg -s nginx add phpfpm
 # svccfg -s nginx:phpfpm addpg nginx application
 # svccfg -s nginx:phpfpm setprop nginx/config = astring: /opt/nginx/etc/fpm.conf
 # svccfg -s nginx:phpfpm setprop nginx/pid = astring: /opt/nginx/run/fpm.pid
 # svcadm disable nginx:phpfpm # This will add system properties automatically
 # svcs nginx
 STATE          STIME    FMRI
 disabled       20:37:30 svc:/network/nginx:phpfpm
 online         20:16:31 svc:/network/nginx:default
 online         20:21:09 svc:/network/nginx:monitoring
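
The same sequence can be wrapped in a function so that new instances are created with one call. This is only a sketch of the commands shown above: the function name add_nginx_instance and its argument order are hypothetical, and it stops at the first svccfg call that fails:

```shell
# Hypothetical wrapper around the svccfg/svcadm calls from the article.
# usage: add_nginx_instance NAME CONFIG_FILE PID_FILE
add_nginx_instance() {
    name=$1
    conf=$2
    pid=$3
    svc=network/nginx

    svccfg -s "$svc" add "$name" &&
    svccfg -s "$svc:$name" addpg nginx application &&
    svccfg -s "$svc:$name" setprop nginx/config = astring: "$conf" &&
    svccfg -s "$svc:$name" setprop nginx/pid = astring: "$pid" &&
    # As in the article: disabling the new instance adds the system properties
    svcadm disable "$svc:$name"
}
```

For example: add_nginx_instance phpfpm /opt/nginx/etc/fpm.conf /opt/nginx/run/fpm.pid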

If you need to run the service as a user other than root, that is possible too. Just add
  <method_context><method_credential user='munin' group='munin' /></method_context>
to the description of an instance or of a single method (if only that method needs to run as another user):
 # ps -fe | grep nginx
    munin  6254     1   0 21:10:52 ?           0:00 /opt/nginx/sbin/nginx -c /opt/nginx/etc/nginx-munin.conf
    munin  6255  6254   0 21:10:52 ?           0:00 /opt/nginx/sbin/nginx -c /opt/nginx/etc/nginx-munin.conf
     root  5884     1   0 19:42:25 ?           0:00 /opt/nginx/sbin/nginx -c /opt/nginx/etc/nginx.conf
   nobody  6015  5884   0 21:05:04 ?           0:00 /opt/nginx/sbin/nginx -c /opt/nginx/etc/nginx.conf

With similar tools you can run services in separate projects (projects are a resource-limiting mechanism). And if you want users other than root to manage services, SMF integrates with Solaris RBAC.
A user can be given either global authorizations to change any methods, dependencies and parameters in the “application”/“framework” groups, or permissions for specific property groups. Each property group can carry the properties modify_authorization, value_authorization and action_authorization, whose values name the authorization required for the corresponding operation.

First we give the user an authorization. I will use a short name, although by convention authorizations should be named meaningfully (for example, solaris.smf.manage.nginx/monitoring). There are also a number of predefined authorizations in /etc/security/auth_attr, but RBAC deserves a separate large article:
 # echo "solaris.munin:::Munin authorization::" >> /etc/security/auth_attr
 # usermod -A solaris.munin munin
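
To confirm that the assignment took effect, the user's authorizations can be listed with the auths command. A minimal sketch, assuming a POSIX shell and that auths USER prints a comma-separated list; the function name has_auth is hypothetical, and wildcard grants such as solaris.* are deliberately not expanded here:

```shell
# Hypothetical helper: succeed if USER holds authorization AUTH by exact name.
# usage: has_auth USER AUTH
has_auth() {
    list=$(auths "$1" 2>/dev/null) || return 1
    # Wrap the list in commas so only whole entries can match
    case ",$list," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

For example: has_auth munin solaris.munin && echo "authorization assigned"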

As long as we have not configured the service, the user cannot do anything:
 munin@sol2$ /usr/sbin/svcadm restart nginx:monitoring
 svcadm: svc:/network/nginx:monitoring: Permission denied.
 munin@sol2$ /usr/sbin/svcadm disable nginx:monitoring
 svcadm: svc:/network/nginx:monitoring: Permission denied.

Let's allow holders of the solaris.munin authorization to restart the service:
 # svccfg -s nginx:monitoring setprop general/action_authorization = astring: solaris.munin
 # svcadm refresh nginx:monitoring

(After any change to the configuration, refresh the service so that SMF rereads the configuration of the running service.)

Checking:
 munin@sol2$ /usr/sbin/svcadm -v restart nginx:monitoring
 Action restart set for svc:/network/nginx:monitoring.
 munin@sol2$ /usr/sbin/svcadm -v disable nginx:monitoring
 svcadm: svc:/network/nginx:monitoring: Could not modify "general" property group (permission denied).

You can see that the user can restart the service but cannot change properties in the SMF database. This lets users restart “their” services after changing a configuration file. If the user should also be able to stop the service permanently, we let him modify values in the “general” property group:
 # svccfg -s nginx:monitoring setprop general/value_authorization = astring: solaris.munin
 # svcadm refresh nginx:monitoring

Checking:
 munin@sol2$ /usr/sbin/svcadm -v disable nginx:monitoring
 svc:/network/nginx:monitoring disabled.
 munin@sol2$ /usr/sbin/svcadm -v enable nginx:monitoring
 svc:/network/nginx:monitoring enabled.


With the techniques above you can build a system that is convenient for users while still keeping fine-grained control over their rights.

I hope this article helps someone understand better what SMF is, and perhaps encourages them to learn more about SMF, RBAC and Solaris.

Source: https://habr.com/ru/post/116224/

