When is a Clone not a Clone? (and vice-versa): Contextualized Analysis of Web Services


  1. When is a Clone not a Clone? (and vice-versa): Contextualized Analysis of Web Services
     Douglas Martin, James R. Cordy, Scott Grant, David B. Skillicorn
     School of Computing, Kingston, Canada

  2. Motivation: The Personal Web
     - The rapidly growing number of web services makes it increasingly difficult to find and choose the right ones
     - Need a quick and convenient way to find alternatives
     - Hand tagging is impractical; automation is needed!

  3. Motivation: Automation
     - Similarity detection techniques offer solutions!
     - Code clone detection from software engineering research can find similar code fragments. Why not similar services?
     - Topic models from data mining research can find text documents with similar semantics. Why not similar services?

  4. Web Service Similarity
     - Web services are stored in service registries, which contain WSDL service description files
     - We could apply clone detection to entire service descriptions
     - But what we really want are similar service operations

  5. Let’s try it!

     First GetStock operation and the Stock type it uses:

         <operation name="GetStock">
           <input message="tns:GetStockRequest"/>
           <output message="tns:GetStockResponse"/>
         </operation>

         <complexType name="Stock">
           <sequence>
             <element name="Supplier" type="xsd:string"/>
             <element name="Warehouse" type="xsd:string"/>
             <element name="OnHand" type="xsd:string"/>
             <element name="OnOrder" type="xsd:string"/>
             <element name="Demand" type="xsd:string"/>
           </sequence>
         </complexType>

     Second GetStock operation and the Stock type it uses:

         <operation name="GetStock">
           <input message="tns:GetStockRequest"/>
           <output message="tns:GetStockResponse"/>
         </operation>

         <complexType name="Stock">
           <sequence>
             <element name="date" type="xsd:string"/>
             <element name="open" type="xsd:float"/>
             <element name="high" type="xsd:float"/>
             <element name="low" type="xsd:float"/>
             <element name="close" type="xsd:float"/>
             <element name="volume" type="xsd:float"/>
           </sequence>
         </complexType>

  6. How about these?

         <operation name="DrawRateChartCustom">
           <input message="DrawRateChartCustomIn"/>
           <output message="DrawRateChartCustomOut"/>
         </operation>

         <operation name="GetTopicBinaryChartCustom">
           <input message="GetTopicBinaryChartCustomSoapIn"/>
           <output message="GetTopicBinaryChartCustomSoapOut"/>
         </operation>

  7. So what went wrong?
     - At this point we thought maybe our idea wasn’t going to work
     - Maybe clone detection can’t help with web service discovery?
     - But why? What’s so special about WSDL?

  11. Web Service Description Language (WSDL)
      A WSDL service description has 3 main parts:
      - a <portType> element, where the operations are declared;
      - <message> elements corresponding to the inputs, outputs and faults of the operations;
      - a <types> element containing an XML Schema that defines the data and structure types used in the messages
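
The three parts listed above can be located programmatically. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`. The `SAMPLE_WSDL` document, its `urn:example:stock` namespace and the `summarize_wsdl` helper are hypothetical, written only for this illustration and not taken from the talk.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for WSDL 1.1 and XML Schema.
NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "xsd": "http://www.w3.org/2001/XMLSchema",
}

# Hypothetical minimal WSDL 1.1 document, invented for this sketch.
SAMPLE_WSDL = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             xmlns:tns="urn:example:stock"
             targetNamespace="urn:example:stock">
  <types>
    <xsd:schema targetNamespace="urn:example:stock">
      <xsd:complexType name="Stock">
        <xsd:sequence>
          <xsd:element name="Supplier" type="xsd:string"/>
          <xsd:element name="OnHand" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:schema>
  </types>
  <message name="GetStockRequest">
    <part name="item" type="xsd:string"/>
  </message>
  <message name="GetStockResponse">
    <part name="stock" type="tns:Stock"/>
  </message>
  <portType name="StockPortType">
    <operation name="GetStock">
      <input message="tns:GetStockRequest"/>
      <output message="tns:GetStockResponse"/>
    </operation>
  </portType>
</definitions>
"""


def summarize_wsdl(wsdl_text):
    """Return the names found in each of the three main WSDL parts."""
    root = ET.fromstring(wsdl_text)
    return {
        # Operations are declared inside <portType>.
        "operations": [op.get("name")
                       for op in root.findall("wsdl:portType/wsdl:operation", NS)],
        # Messages are top-level siblings of <portType>.
        "messages": [m.get("name") for m in root.findall("wsdl:message", NS)],
        # Named complex types live in the schema inside <types>.
        "types": [t.get("name")
                  for t in root.findall("wsdl:types/xsd:schema/xsd:complexType", NS)],
    }


summary = summarize_wsdl(SAMPLE_WSDL)
print(summary)
```

Note that the operation, its messages and the type it returns sit in three different places in the file, which is the scattering the later slides describe.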

  14. Web Service Description Language (WSDL)
      This simple example service has two operations:
      - ReserveRoom
      - GetAvailableRooms

  15. Web Service Description Language (WSDL)
      - WSDL service description files contain descriptions of the operations that a web service has to offer
      - But the pieces of each operation’s own description are scattered over different parts of the WSDL file
      - This makes it difficult to identify complete units to analyze and compare

  16. The Problem
      This poses a problem for analysis techniques:
      - Operations cannot easily be compared for similarity using clone detectors, because there are no contiguous fragments to compare
      - And they cannot be analyzed using data mining topic models, because there are no separate complete documents to generate a model from

  17. Our Solution
      - Our solution is to contextualize the original <operation> elements, to create self-contained operation descriptions
      - We use source transformation to inline remote information from the context into the elements that reference or depend on it
      - We call these contextualized WSDL operations Web Service Cells, or WSCells
      - This is the first example of a new kind of clone detection: contextual clones
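
The inlining idea can be sketched in a few lines. The talk uses source transformation for this step; the sketch below swaps in Python's `xml.etree.ElementTree` purely for illustration, so it is not the authors' implementation, and the layout of the resulting cell is an assumption. `SAMPLE_WSDL`, `local` and `contextualize` are hypothetical names. Operations are keyed by name, which assumes names are unique within the file.

```python
import copy
import xml.etree.ElementTree as ET

NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "xsd": "http://www.w3.org/2001/XMLSchema",
}
XSD = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical minimal WSDL 1.1 document, invented for this sketch.
SAMPLE_WSDL = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             xmlns:tns="urn:example:stock"
             targetNamespace="urn:example:stock">
  <types>
    <xsd:schema targetNamespace="urn:example:stock">
      <xsd:complexType name="Stock">
        <xsd:sequence>
          <xsd:element name="Supplier" type="xsd:string"/>
          <xsd:element name="OnHand" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:schema>
  </types>
  <message name="GetStockRequest">
    <part name="item" type="xsd:string"/>
  </message>
  <message name="GetStockResponse">
    <part name="stock" type="tns:Stock"/>
  </message>
  <portType name="StockPortType">
    <operation name="GetStock">
      <input message="tns:GetStockRequest"/>
      <output message="tns:GetStockResponse"/>
    </operation>
  </portType>
</definitions>
"""


def local(qname):
    """Drop a namespace prefix such as 'tns:' from a QName string."""
    return qname.split(":")[-1]


def contextualize(wsdl_text):
    """Return {operation name: self-contained copy of the operation}."""
    root = ET.fromstring(wsdl_text)
    messages = {m.get("name"): m for m in root.findall("wsdl:message", NS)}
    types = {t.get("name"): t
             for t in root.findall("wsdl:types/xsd:schema/*", NS)
             if t.get("name")}
    cells = {}
    for op in root.findall("wsdl:portType/wsdl:operation", NS):
        cell = copy.deepcopy(op)
        for ref in cell:  # <input>, <output> and <fault> children
            message = messages.get(local(ref.get("message", "")))
            if message is None:
                continue
            for part in message:
                # Inline the message part under the element that references it.
                part_copy = copy.deepcopy(part)
                ref.append(part_copy)
                # Inline the schema definition the part refers to, if it is local.
                target = part.get("type") or part.get("element") or ""
                definition = types.get(local(target))
                if definition is not None:
                    part_copy.append(copy.deepcopy(definition))
        cells[op.get("name")] = cell
    return cells


cells = contextualize(SAMPLE_WSDL)
print(ET.tostring(cells["GetStock"], encoding="unicode"))
```

After this step the `GetStock` element carries its message parts and the `Stock` type definition inside it, so a clone detector or topic model can treat it as one contiguous unit. A fuller version would also follow references between types recursively.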

  18. Contextualizing WSDL Operations

  19. Contextual Clone Detection

  20. An Experiment
      - We ran an experiment to investigate the difference between clone detection on WSCells and on the original raw operations
      - Two sets of WSDL service description files: 1,100 operations and 7,500 operations
      - Compared NICAD clone detector results for each set at various near-miss difference thresholds (0% = exact clone, 10% = 1 line in 10 different, and so on)
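
To make the threshold concrete, here is a rough sketch of a line-level difference measure. It uses Python's `difflib.SequenceMatcher` as a stand-in for the line comparison and is not NICAD's actual implementation; `line_difference` and `is_clone` are hypothetical helper names, and the fragments are invented.

```python
from difflib import SequenceMatcher


def line_difference(a_lines, b_lines):
    """Fraction of lines in the longer fragment with no match in the other."""
    matcher = SequenceMatcher(None, a_lines, b_lines, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    longest = max(len(a_lines), len(b_lines))
    return 0.0 if longest == 0 else (longest - matched) / longest


def is_clone(a_lines, b_lines, threshold):
    """Treat two fragments as clones if they differ by at most `threshold`."""
    return line_difference(a_lines, b_lines) <= threshold


# Two invented 10-line fragments that differ in exactly one line.
fragment_a = [f"line {i}" for i in range(10)]
fragment_b = list(fragment_a)
fragment_b[4] = "changed line"

print(line_difference(fragment_a, fragment_b))   # 1 differing line out of 10
print(is_clone(fragment_a, fragment_b, 0.0))     # exact clones only
print(is_clone(fragment_a, fragment_b, 0.1))     # near-miss allowed
```

Under this measure the pair is rejected at a threshold of 0.0 and accepted at 0.1, matching the "1 line in 10 different" reading of the 10% threshold.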

  21. An Experiment
      - Number of clones decreases with WSCells

      Clone pairs:

      | Difference threshold | Set 1: Originals | Set 1: WSCells | Set 2: Originals | Set 2: WSCells |
      |----------------------|------------------|----------------|------------------|----------------|
      | 0.0                  | 852              | 705            | 1434             | 1066           |
      | 0.1                  | 852              | 734            | 1434             | 1228           |
      | 0.2                  | 879              | 775            | 1438             | 1637           |
      | 0.3                  | 884              | 813            | 1469             | 1637           |

      - Reduction in false positives (the slide repeats the two GetStock operations from slide 5, whose operation elements are identical but whose Stock types differ)

  22. An Experiment
      - Number of clone classes can increase with WSCells

      Clone classes:

      | Difference threshold | Set 1: Originals | Set 1: WSCells | Set 2: Originals | Set 2: WSCells |
      |----------------------|------------------|----------------|------------------|----------------|
      | 0.0                  | 169              | 187            | 587              | 433            |
      | 0.1                  | 169              | 139            | 587              | 499            |
      | 0.2                  | 172              | 142            | 589              | 631            |
      | 0.3                  | 171              | 136            | 591              | 631            |

      - Classes split by deeper differences, giving more precision (the slide again shows the two GetStock operations from slide 5)
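
How can the number of clone pairs fall while the number of clone classes rises? The sketch below assumes a clone class is a connected group of fragments linked by clone pairs. That grouping rule is an assumption made for illustration, not a description of how NICAD forms classes. The `clone_classes` helper and the operation labels A to D are hypothetical, not data from the experiment.

```python
def clone_classes(pairs):
    """Group fragments into classes: connected components of the pair graph."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for item in list(parent):
        groups.setdefault(find(item), set()).add(item)
    return sorted(sorted(group) for group in groups.values())


# Invented example: four raw operations that all look alike...
pairs_raw = [("A", "B"), ("A", "C"), ("A", "D"),
             ("B", "C"), ("B", "D"), ("C", "D")]
# ...but once their types are inlined, only A/B and C/D still match.
pairs_contextualized = [("A", "B"), ("C", "D")]

classes_raw = clone_classes(pairs_raw)
classes_contextualized = clone_classes(pairs_contextualized)
print(len(pairs_raw), classes_raw)
print(len(pairs_contextualized), classes_contextualized)
```

In this toy case the pairs drop from six to two while the classes go from one to two: one coarse class splits into two more precise ones.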
