The traditional way to monitor power consumption in a datacenter is to use smart power outlets: Power Distribution Units (PDUs).

You can connect to each one over the network or through the console and see the consumption of a whole segment or of an individual outlet, which is a useful capability. Manufacturers bundle the appropriate software packages or even entire specialized servers, such as SPM from Server Technology.
You can draw a beautiful diagram of the datacenter with the server locations and build all kinds of graphs. The price tag is to match: inhuman.
You can watch how much the data center consumes, but that’s all. What if you want to manage consumption?
Use the built-in server features!
How the server monitors and controls power consumption
A modern server lets you read its power consumption through IPMI tools:
~> sudo ipmitool sdr
Power Unit | 150W | ns
Reading it through raw IPMI commands is also possible.
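A minimal sketch of turning that output into numbers, assuming the `name | reading | status` layout shown above. The helper name `parse_sdr_line` is hypothetical and is not part of ipmitool; real firmware may print the reading in other forms (for example `150 Watts`), which the pattern below also accepts.

```python
import re

# A numeric reading followed by a watt unit, e.g. "150W" or "150 Watts".
_WATTS = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(?:W|Watts)\s*$", re.IGNORECASE)


def parse_sdr_line(line):
    """Split one 'name | reading | status' line into (name, watts, status).

    watts is None when the reading field is not a power value.
    """
    fields = [field.strip() for field in line.split("|")]
    if len(fields) != 3:
        raise ValueError(f"unexpected SDR line: {line!r}")
    name, reading, status = fields
    match = _WATTS.match(reading)
    watts = float(match.group(1)) if match else None
    return name, watts, status


# Sample line taken from the output above.
print(parse_sdr_line("Power Unit | 150W | ns"))
```

Feeding it the lines captured from `ipmitool sdr` one at a time gives values that are easy to log or graph.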
This became possible thanks to Intel Node Manager technology, which was introduced with the Nehalem platform (the Xeon 5500 series). It relies in turn on the capabilities of smart power supplies and on control over the PMBus.

On the server side, it is built on the capabilities of the Intel Management Engine and the BMC (OCP does without a BMC):
Server Service Bus Connections
Server power consumption is controlled by changing the processor's power and throttling states (P-states and T-states). Memory consumption can also be limited, but that appeared only in the latest revision, together with the E5 processors.
Comparison of NM versions: RMLY - Xeon E5, BRLW - Xeon E3
In total, you get:
- Monitoring consumption over time. Intel Intelligent Power Node Manager measures platform consumption with a tolerance of ±10%. Data is collected in real time through the Power Supply Management Interface (PSMI).
- Platform power limiting (power capping). You can set an upper limit that the platform will not exceed. The policy is set through the IPMI/DCMI console, and CPU consumption is held down by switching P-states. The limit applies to processors and memory.
- Warnings about excess consumption. If the platform cannot stay within the target budget, a warning is sent to the management console.
In the Open Compute Platform and Open CloudServer rack solutions, you can limit consumption through the rack controllers.
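Because readings carry a tolerance of about ±10%, comparing a reading with a power budget has a grey zone. A rough sketch of that check follows. `budget_status` and its return labels are hypothetical, the tolerance is applied to the reading as a simplification, and this only illustrates the arithmetic, not how Node Manager itself decides to raise a warning.

```python
def budget_status(reading_w, budget_w, tolerance=0.10):
    """Classify a power reading against a budget, allowing for sensor tolerance.

    Returns "within" if even the upper bound fits the budget, "exceeded" if even
    the lower bound is above it, and "uncertain" otherwise.
    """
    low = reading_w * (1 - tolerance)
    high = reading_w * (1 + tolerance)
    if high <= budget_w:
        return "within"
    if low > budget_w:
        return "exceeded"
    return "uncertain"


for reading in (150, 190, 240):
    print(reading, budget_status(reading, budget_w=200))
```

With a 200 W budget, a 190 W reading lands in the grey zone: the real draw may be anywhere from about 171 W to 209 W.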
Power Capping
This is the most interesting and important technology in energy management.
Intel's lineup includes processors with different thermal design power (TDP) ratings, as well as special L-series models with reduced consumption. With these you get a processor that never exceeds a certain power, but it performs worse at the same or a higher price than a conventional processor.
Do they make sense?
As testing by AnandTech showed, no.
Physics cannot be cheated: the same amount of work requires the same amount of energy (plus a difference due to how quickly the processor and memory drop into low-power states). The total energy consumed is close enough that the price premium for the processor will not pay off over the server's service life.
By putting a power cap on a server with a regular processor, you get the same result as with a reduced-consumption model.
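The argument can be written as a one-line energy balance. The figures are illustrative assumptions, not measurements from the AnandTech tests: a job that a regular processor finishes in 100 s at 95 W is assumed to take proportionally longer on a part held to 60 W.

```latex
\begin{align*}
E &= P \cdot t \\
E_{\text{regular}} &= 95\,\mathrm{W} \times 100\,\mathrm{s} = 9500\,\mathrm{J} \\
E_{\text{low-power}} &= 60\,\mathrm{W} \times 158.3\,\mathrm{s} \approx 9500\,\mathrm{J}
\end{align*}
```

The lower-power part draws less at any moment but runs longer, so the energy bill for the same job comes out about the same.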
What is Power Cap for?
For smoothing out consumption spikes. Moreover, it is convenient to limit not a single server but a whole rack. Servers are loaded unevenly, so some can draw more than the average and some less; what matters is that the total stays within the limit. Performance will hardly suffer (except when all systems must run at full load with TurboBoost), and rack consumption will drop. Most importantly, data center consumption becomes much more predictable, and more systems can be put in one rack.
The same AnandTech testing supports this. A server where TurboBoost processor acceleration is active delivers a noticeably shorter response time to requests than a machine that is strictly limited “from above”. A rack where the load on the systems is uneven will work better, and under prolonged high load it will show some deterioration in response time while overall system performance stays stable. In this case, space savings according to Intel may be 20% or more.
Of course, it makes no sense to limit processors below their TDP, unless response time does not matter to you, the average processor load is well below maximum performance, and you need to fit as many systems as possible in a rack.
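One simple way to picture a rack-level cap is proportional sharing: servers keep what they ask for while the rack total fits under the limit, and are scaled back together when it does not. This is a sketch under that assumption, not the policy algorithm used by Node Manager or DCM; `share_rack_budget` and the server names are hypothetical.

```python
def share_rack_budget(demand_w, rack_cap_w):
    """Return a per-server power limit so that the limits sum to at most rack_cap_w.

    demand_w maps server name -> power it would draw uncapped, in watts.
    """
    total = sum(demand_w.values())
    if total <= rack_cap_w:
        # The rack fits under its cap: nobody needs to be throttled.
        return dict(demand_w)
    # Over budget: scale every server back by the same factor.
    return {name: watts * rack_cap_w / total for name, watts in demand_w.items()}


demand = {"web-1": 300, "web-2": 200, "db-1": 100}
print(share_rack_budget(demand, rack_cap_w=700))  # under the cap, unchanged
print(share_rack_budget(demand, rack_cap_w=480))  # over the cap, scaled by 480/600
```

A real policy would likely also honour priorities and per-server minimums; the point here is only that the rack total, not each server, is what gets bounded.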
To sum up, consumption management technologies offer a number of advantages:
- Increased rack density: managing each server's power budget in the management system according to its actual load lets unused capacity go to additional servers in the rack.
- Maximum performance within power and temperature limits: dynamic management of server consumption lets you give more resources to critical tasks by reducing the performance of secondary ones.
- Reduced power consumption: the cooling control system receives real data on server power consumption and temperature and reduces its output as needed.
- Load balancing: server power consumption and temperature can be fed into the cloud or virtual environment management system to balance the load between racks.
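The density gain is simple arithmetic: the rack's power feed divided by the per-server budget. The numbers below are made up for illustration (an 8 kW rack, 400 W nameplate servers capped at 320 W) and are not taken from Intel's figures.

```python
def servers_per_rack(rack_feed_w, per_server_budget_w):
    """How many servers fit under a rack power feed at a given per-server budget."""
    return rack_feed_w // per_server_budget_w


nameplate = servers_per_rack(8000, 400)  # budgeting by nameplate power
capped = servers_per_rack(8000, 320)     # budgeting by an enforced cap
print(nameplate, capped, f"{(capped - nameplate) / nameplate:.0%} more servers")
```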

What tool is convenient to manage these functions?
Intel DCM: Energy Director
Intel offers the DataCenter Manager product, which has been split into two: DCM: Energy Director and DCM: Virtual KVM Gateway. Besides monitoring with a web interface, it is also a tool that can be integrated into your own software.
Main panel
DCM: Energy Director features and benefits
Easy installation
- Installation in minutes with minimal system requirements
- Scanning the network and adding devices in automatic or manual mode
- Convenient graphical interface with simple addition of new racks, rows of racks and data center rooms for visualization of infrastructure
Collects real-time consumption data
- Data collected in Out-of-Band mode.
- Does not require access to the OS on the client
- Collects statistics over a long period of time to build trends and analysis
Analysis Tools
- Identifying hot and cold zones in the data center helps reduce losses from excessive cooling
- Detection of underloaded systems that may take additional load or be limited in consumption
- Server power visualization to assess the impact of changes in power management policy on device consumption
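Spotting underloaded systems can be as simple as comparing average draw with what a server is allowed to draw. A hedged sketch with hypothetical names and an arbitrary 50% threshold follows; DCM's own detection logic is not described in this article.

```python
from statistics import mean


def underloaded(samples_w, limit_w, threshold=0.5):
    """Return names of servers whose mean draw is below threshold * their limit.

    samples_w maps server name -> list of power samples in watts.
    limit_w maps server name -> power limit (or nameplate power) in watts.
    """
    return sorted(
        name
        for name, samples in samples_w.items()
        if samples and mean(samples) < threshold * limit_w[name]
    )


samples = {"web-1": [120, 130, 110], "db-1": [310, 330, 320]}
limits = {"web-1": 400, "db-1": 400}
print(underloaded(samples, limits))
```

Servers on the resulting list are candidates either for extra load or for a tighter cap.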
History storage
- Statistics are stored for one year (default)
- Consumption and temperature data are aggregated for viewing at the level of a room, a row of racks, or an individual rack.
- Data export allows you to integrate DCM with third-party analysis tools.
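Rolling power readings up the hierarchy is a plain group-and-sum. The sketch below assumes each reading is tagged with a (room, row, rack) location; the tuple layout and the function name are illustrative, not DCM's data model or export format. Temperature would need a different aggregate, such as a mean or maximum.

```python
from collections import defaultdict


def roll_up(readings):
    """Sum per-server power (watts) at room, row and rack level.

    readings is an iterable of ((room, row, rack), watts) pairs.
    Returns a dict keyed by location prefix: (room,), (room, row), (room, row, rack).
    """
    totals = defaultdict(float)
    for location, watts in readings:
        for depth in range(1, len(location) + 1):
            totals[location[:depth]] += watts
    return dict(totals)


readings = [
    (("room-A", "row-1", "rack-01"), 250.0),
    (("room-A", "row-1", "rack-01"), 310.0),
    (("room-A", "row-1", "rack-02"), 180.0),
    (("room-A", "row-2", "rack-07"), 400.0),
]
totals = roll_up(readings)
print(totals[("room-A",)], totals[("room-A", "row-1")], totals[("room-A", "row-1", "rack-01")])
```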
Alert and control system
- Generates alerts and directs them to other management tools.
- Applies consumption policies to free up reserve power without sacrificing performance
- Lets you apply consumption policies to reduce the risk of over-provisioning power to servers
Here’s how it looks live:
Adding discovered servers to datacenter rack
Creating energy policy
Type of data center
Creating thresholds for response
Optimization Tips
| Feature | Details |
| --- | --- |
| Monitoring | Monitoring of power consumption and inlet temperature, with data aggregated by rack, row, and room; user-defined physical or logical groups; alerts for user-defined power and temperature events; an energy estimation algorithm for legacy servers that have no power monitoring; display of asset tags and server serial numbers for a number of manufacturers |
| Tracking trends | Logging of power and temperature, with filtered queries for calculating trends; data is stored for one year for capacity planning purposes |
| Control | Intelligent, patented group policy mechanism; several active power policy types are supported simultaneously at the same hierarchy level; workload prioritization is supported as a policy; policies can be scheduled, including power limits by time of day and/or day of the week; group power limiting that adapts dynamically to changing server load; Intel® Node Manager 2.0 supports memory power limiting and dynamic core allocation |
| No agent | Does not require any software agents on managed nodes |
| Simple integration and coexistence | The device inventory system pre-scans the IP address range in use; exposes high-level Web Services Description Language (WSDL) application programming interfaces; can run on a standalone management server or coexist on the same server with an ISV product; temperature-aware power planning with modelling of inlet and outlet temperatures (depends on the manufacturer); exhaust air temperature sensor (depends on the manufacturer) |
| Scalability | Can manage tens of thousands of servers |
| Security | Protected APIs; secure communication channels with managed nodes; encryption of all sensitive data |
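A time-of-day and day-of-week schedule can be modelled as a list of windows, each with its own cap. This is only a sketch of the idea with hypothetical names (`Policy`, `active_cap`); it is not the DCM API. It assumes that the strictest cap wins when windows overlap, and it does not handle windows that cross midnight.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class Policy:
    cap_w: int
    days: frozenset   # weekdays as integers, Monday == 0
    start: time       # inclusive
    end: time         # exclusive


def active_cap(policies, when, default_w=None):
    """Return the lowest cap among policies whose window contains `when`."""
    caps = [
        p.cap_w
        for p in policies
        if when.weekday() in p.days and p.start <= when.time() < p.end
    ]
    return min(caps) if caps else default_w


policies = [
    Policy(cap_w=350, days=frozenset(range(5)), start=time(9), end=time(18)),
    Policy(cap_w=250, days=frozenset({5, 6}), start=time(0), end=time(23, 59, 59)),
]
print(active_cap(policies, datetime(2024, 1, 8, 10, 30)))   # a Monday morning
print(active_cap(policies, datetime(2024, 1, 13, 10, 30)))  # a Saturday
```

Outside every window the function falls back to `default_w`, which here means no cap is scheduled.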
DCM: Energy Director works with all the servers we manufacture and is available for trial and production use.