📜 ⬆️ ⬇️

Tree Tree Comparison *

Hi, Habr!

* Actually not quite like that. When developing an information system, part of which is the various processing of design and technological documentation, I had a problem that can be briefly described as follows. Today we have one composition of the product, several changes in different parts of this product come in one day and by the evening it is not clear what has changed? Products can sometimes have more than 10,000 elements in the composition, the elements are not unique, and the reality is that the changes in composition can actively come, although the product is almost ready. Not understanding the scope of changes makes planning difficult.

The composition of the product can be represented as a tree graph. Not finding a suitable way to compare two graphs, I decided to write my bicycle .

Task


During the day, various changes from the design system come to the accounting system. On the data from the accounting system is built production planning. Conditions allow you to accept all changes for the day and recalculate the product specification at night. But, as I wrote above, it is not clear how the state of yesterday differs from the state of today.
')
I want to see what was left of the wood product, and what was added. Just want to see which part or assembly replaced another part or assembly. For example, if an intermediate node was added to a tree branch, then it would be wrong to assume that all the subordinate elements were removed from the old places and inserted into the new places. They remained in place, but an intermediate node was inserted. In addition, the element can "travel" up and down only within one branch of the tree, this is due to the specifics of the manufacturing process.



Training


The product specification is recalculated by the PL / SQL procedure inside Oracle. Therefore, I found it logical to do my search for changes to PL / SQL.

The table in which the product tree is stored, let's call it MR_ORDER_TREE , has the following form:
order_idOrder ID on which the tree is attached
item_idThe element ID in the tree, together with order_id, forms the primary key and is unique within the order.
item_refID of the element to which the selected element belongs
kts_item_idID from the directory of parts and assemblies
item_qtyamount
is_item_buyIs the product purchased?
item_positionItem number in assembly

Bundles (item_ref, kts_item_id) unique. In addition to the purchase, the position and quantity there are other attributes of a particular element, but it is not about them.

At recalculation of the specification, the data, under the changed orders, completely are removed and are considered again. Therefore it is necessary to preserve the state of the tree before recalculation. For this we use a similar table MR_ORDER_TREE_PREV .

The result of the comparison will be stored in the MR_ORDER_TREE_COMP table, which will additionally have two columns:
statColumn to mark the status of items:
-1 - additional root element (about it below)
0 - item removed
1 - item added
2 - item properties have changed
3 - unknown condition (just in case something went wrong)
4 - the item has not changed
commComment giving additional data as

First, take the previous and current trees and drop them into the MR_ORDER_TREE_COMP table. In this case, we add to them the common root, item_id at the current tree will increase by (maximum value + 1) item_id tree with the previous state. All the elements of the old tree will be considered as deletion, and all the elements of the new one as an insert.

 select nvl(max(item_id), 0) + 1 into v_id from mr_order_tree_prev t where t.order_id = p_order_id; insert into MR_ORDER_TREE_COMP (ORDER_ID, ITEM_ID, KTS_ITEM_ID, ITEM_QTY, IS_ITEM_BUY, IS_ADD_WORK, ITEM_POSITION, N_GROUP, T_LEVEL, STAT, COMM) values (p_order_id, -1, null, 0, 0, 0, 0, 0, 0, -1, '   '); insert into MR_ORDER_TREE_COMP (ORDER_ID, ITEM_ID, ITEM_REF, KTS_ITEM_ID, KTS_ITEM_REF, ITEM_QTY, IS_ITEM_BUY, IS_ADD_WORK, ITEM_POSITION, N_GROUP, STAT, COMM) select p.order_id, p.item_id, nvl(p.item_ref, -1), p.kts_item_id, p.kts_item_ref, p.item_qty, p.is_item_buy, p.is_add_work, p.item_position, p.n_group, 0, '' from mr_order_tree_prev p where p.order_id = p_order_id; insert into MR_ORDER_TREE_COMP (ORDER_ID, ITEM_ID, ITEM_REF, KTS_ITEM_ID, KTS_ITEM_REF, ITEM_QTY, IS_ITEM_BUY, IS_ADD_WORK, ITEM_POSITION, N_GROUP, STAT, COMM) select p.order_id, p.item_id + v_id, case when p.item_ref is null then -1 else p.item_ref + v_id end, p.kts_item_id, p.kts_item_ref, p.item_qty, p.is_item_buy, p.is_add_work, p.item_position, p.n_group, 1, '' from mr_order_tree p where p.order_id = p_order_id; 


We also put down on the elements of the total tree their nesting level in order not to calculate it every time.

 for rec in (select item_id, level lev from (select * from mr_order_tree_comp where order_id = p_order_id) connect by prior item_id = item_ref start with item_id = -1) loop update mr_order_tree_comp c set c.t_level = rec.lev where c.order_id = p_order_id and c.item_id = rec.item_id; end loop; 

Save the state of the mr_order_tree_comp table for the recalculated order, this will be needed in the future. I used the collection, but I think that a temporary table can also be applied.

  procedure save_tree_stat(p_order in number) is begin select TREE_BC_STAT_ROW(c.order_id, c.item_id, c.item_ref, c.kts_item_id, c.kts_item_ref) bulk collect into tree_before_calc from mr_order_tree_comp c where c.order_id = p_order; end save_tree_stat; 

"The addition of" trees


Now you need to go through the resulting tree through the levels and nodes in search of change. To begin, we define the maximum level:

 select max(t_level) into v_max_lvl from mr_order_tree_comp where order_id = p_order_id; 

Now in the cycle go through the levels of this tree. At each level, also in the loop, we will select the child elements of each node, and look for the element with the “opposite sign”. Simply put, look for the same kts_item_id with the same item_ref , but state 1 for 0 and 0 for 1. After that, one of them should be deleted and the incoming elements should be reassigned to the remaining node.

When an item is processed, place it on some list of processed items. I used the following procedure:

  procedure add_to_rdy (p_item in number, p_order in number) is begin item_ready_list.extend; item_ready_list(item_ready_list.last) := tree_rdy_list_row(p_order, p_item); end add_to_rdy; 

Element "Handling" returned by function

 function item_rdy(p_item in number, p_order in number) return number 

which was browsing the same collection.

The cycle is as follows:

 <<lvls>> for i in 1..v_max_lvl loop <<heads>> for rh in (select c.* from mr_order_tree_comp c where c.order_id = p_order_id and c.t_level = i) loop <<leafs>> for rl in (select c.* from mr_order_tree_comp c where c.order_id = p_order_id and c.item_ref = rh.item_id and c.stat in (0, 1) order by c.stat) loop if (item_rdy(rl.item_id, rl.order_id) = 0) then if (rl.stat = 0) then select count(*) into v_cnt from mr_order_tree_comp c where c.order_id = p_order_id and c.item_ref = rh.item_id and c.kts_item_id = rl.kts_item_id and c.stat = 1; case when (v_cnt = 1) then select c.item_id into v_item from mr_order_tree_comp c where c.order_id = p_order_id and c.item_ref = rh.item_id and c.kts_item_id = rl.kts_item_id and c.stat = 1; update mr_order_tree_comp c set c.item_ref = v_item where c.item_ref = rl.item_id and c.order_id = p_order_id; update mr_order_tree_comp c set c.stat = 4 where c.item_id = v_item and c.order_id = p_order_id; diff_items(p_order_id, rl.item_id, v_item); delete mr_order_tree_comp c where c.item_id = rl.item_id and c.order_id = p_order_id; add_to_rdy(rl.item_id, rl.order_id); add_to_rdy(v_item, rl.order_id); end case; end if; 

For (rl.stat = 1) logic is similar.

When the same element was found, everything is simple - you just need to compare their properties. To do this, use the diff_items function. The situation, when more than one element is found, is more likely an exception and says that there is something wrong with the composition tree.

Search for "similar" items


But what to do when there was a replacement of one node by another, was the node inserted in the middle or was the node in the middle removed?

To determine the situations described, I did not find anything smarter than just comparing the compositions of two nodes in order to determine the ratio of the number of identical kts_item_id to the total number of elements. If the value of this relationship is greater than a certain value, then the nodes are interchangeable. If at the current loop iteration the node has several replacement options, then the variant with the highest “similarity coefficient” is taken.

Perhaps with such a bold decision I will ever shoot myself in the foot.

Add some code to the main CASE .

 when (v_cnt = 0) then select count(*) into v_cnt from mr_order_tree_comp c where c.order_id = p_order_id and c.item_ref = rh.item_id and c.stat = 1 and not exists (select 1 from table(item_ready_list) a where a.order_id = c.order_id and a.item_id = c.item_id); 

Managed to find one item in this node.

  if (v_cnt = 1) then select c.item_id, c.kts_item_id into v_item, v_kts from mr_order_tree_comp c where c.order_id = p_order_id and c.item_ref = rh.item_id and c.stat = 1 and not exists (select 1 from table(item_ready_list) a where a.order_id = c.order_id and a.item_id = c.item_id); 

We rehash the contents of the current node without hesitation. Provided that both elements are assemblies. The current item is not deleted. Why? If there are no identical elements in the nodes at all, then everything will be returned to their places.

  if (is_item_comp(v_item, p_order_id) = is_item_comp(rl.item_id, p_order_id)) then update mr_order_tree_comp c set c.item_ref = v_item where c.item_ref = rl.item_id and c.order_id = p_order_id; add_to_rdy(rl.item_id, rl.order_id); add_to_rdy(v_item, rl.order_id); end if; end if; 

If you manage to find multiple items, then the most similar assembly is taken. To define “similarity”, the like_degree procedure is like_degree , the value of the coefficient for comparison is contained in the variable lperc .

 if (v_cnt > 1) then begin select item_id, kts_item_id, max_lperc into v_item, v_kts, v_perc from (select c.item_id, c.kts_item_id, max(like_degree(rl.item_id, c.item_id, c.order_id)) max_lperc from mr_order_tree_comp c where c.order_id = p_order_id and c.item_ref = rh.item_id and c.stat = 1 and not exists (select 1 from table(item_ready_list) a where a.order_id = c.order_id and a.item_id = c.item_id) and is_item_comp(c.item_id, p_order_id) = (select is_item_comp(rl.item_id, p_order_id) from dual) group by c.item_id, c.kts_item_id order by max_lperc desc) where rownum < 2; if (v_perc >= lperc) then update mr_order_tree_comp c set c.item_ref = v_item where c.item_ref = rl.item_id and c.order_id = p_order_id; update mr_order_tree_comp c set c.comm = '.   ' || kts_pack.item_code(v_kts) || ' (' || to_char(v_perc) || '%)' where c.item_id = rl.item_id and c.order_id = p_order_id; add_to_rdy(rl.item_id, rl.order_id); add_to_rdy(v_item, rl.order_id); end if; end; end if; 

As a result, some of the elements will be re-linked, and the tree will look like this.

If the additional root element that was added at the beginning is not needed, then it can be removed.

Now you can take all the remaining elements with status 0 and 1, and starting with them go up to the root. If the same element with “opposite status” is found, then compare them, remove the element from the tree with 0, and change the status of the element with 1.

 <<items>> for rs in (select * from mr_order_tree_comp c where c.order_id = p_order_id and c.stat in (0,1)) loop <<branch>> for rb in (select * from (select * from mr_order_tree_comp c where c.order_id = p_order_id) t connect by prior t.item_ref = t.item_id start with t.item_id = rs.item_id) loop select count(*) into v_cnt from mr_order_tree_comp c where c.item_ref = rb.item_id and c.kts_item_id = rs.kts_item_id and c.stat in (1,0) and c.stat != rs.stat; if (v_cnt = 1) then select c.item_id into v_item from mr_order_tree_comp c where c.item_ref = rb.item_id and c.kts_item_id = rs.kts_item_id and c.stat in (1,0) and c.stat != rs.stat; if (rs.stat = 0) then update mr_order_tree_comp c set c.stat = 4 where c.order_id = p_order_id and c.item_id = v_item; diff_items(p_order_id, rs.item_id, v_item); update mr_order_tree_comp c set c.item_ref = v_item where c.order_id = p_order_id and c.item_ref = rs.item_id; delete mr_order_tree_comp where order_id = p_order_id and item_id = rs.item_id; end if; if (rs.stat = 1) then update mr_order_tree_comp c set c.stat = 4 where c.order_id = p_order_id and c.item_id = rs.item_id; diff_items(p_order_id, rs.item_id, v_item); update mr_order_tree_comp c set c.item_ref = rs.item_id where c.order_id = p_order_id and c.item_ref = v_item; delete mr_order_tree_comp where order_id = p_order_id and item_id = v_item; end if; continue items; end if; end loop branch; end loop items; 

Now let's go through the remaining items with status 0 and restore their previous item_ref . To do this, use the tree_before_calc collection, in which the initial state of the tree mr_order_tree_comp .

After that, the tree gets the desired form.



I believe that there are more beautiful, fast and correct ways to compare trees. This is my option and I hope that it will be useful to someone and show you how to do it or how not to do it.

Source: https://habr.com/ru/post/334384/


All Articles