The complexity and diversity of government regulations make understanding and retrieving regulations a non-trivial task. One issue is that regulations and interpretive guides come from multiple sources that differ in format, terminology and context. This work proposes an information infrastructure for regulation management and analysis, comprising a consolidated document repository and tools for similarity analysis. The corpus covers accessibility and environmental regulations from the US Federal government, California state government, non-profit organizations and some European agencies.
The regulatory repository is populated with regulations in XML format. XML is chosen as the representation format because it is well suited to semi-structured data such as legal documents. A shallow parser is developed to consolidate regulations published in different formats, such as PDF or HTML, into XML. The shallow parser also extracts important features, including concepts, measurements and definitions, and incorporates them into the XML structure.
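As a rough illustration of the kind of feature-annotated XML described above, the following minimal sketch wraps one provision in an XML element and tags the concepts and measurements found in its text. It is not the parser from this work: the element names (`provision`, `concept`, `measurement`), the `MEASUREMENT` pattern, the function `provision_to_xml`, and the sample provision text and section number are all hypothetical and chosen only for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern: a number followed by a unit of length.
MEASUREMENT = re.compile(r"(\d+(?:\.\d+)?)\s*(inches|inch|feet|foot|mm|cm|m)\b")


def provision_to_xml(section_id, text, concepts):
    """Wrap one provision in XML and tag the features found in its text."""
    node = ET.Element("provision", id=section_id)
    ET.SubElement(node, "text").text = text
    for concept in concepts:
        # Keep only the concepts that literally occur in the provision text.
        if concept.lower() in text.lower():
            ET.SubElement(node, "concept", name=concept)
    for value, unit in MEASUREMENT.findall(text):
        ET.SubElement(node, "measurement", value=value, unit=unit)
    return node


# Made-up sample provision, not taken from any real regulation.
example = provision_to_xml(
    "4.7.2",
    "The slope of a curb ramp shall not exceed 1:12 and the width "
    "shall be at least 36 inches.",
    ["curb ramp", "handrail"],
)
xml_string = ET.tostring(example, encoding="unicode")
```

Storing features as child elements of the provision keeps them next to the text they were extracted from, so later comparison steps can read them without re-parsing the original document.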
With a well-formed regulatory repository in place, analysis tools are developed to help retrieve related provisions from different domains of regulations. The theory and implementation of a relatedness analysis framework are presented. The goal is to identify the most strongly related provisions using not only a traditional term match but also a combination of feature matches, and not only content comparison but also structural analysis. Regulations are first compared based on conceptual information as well as domain knowledge through a combination of feature matching. Regulations also possess specific structures, such as a tree hierarchy of provisions and a referential structure. These structures carry useful information for locating related provisions and are therefore exploited in the analysis for a more complete comparison.
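The two-stage idea above, a content comparison followed by a structural refinement, can be sketched as follows. This is a minimal sketch under stated assumptions and not the scoring used in this work: Jaccard overlap stands in for the feature match, the feature weights and the blending factor `alpha` are arbitrary, and the function and parameter names are hypothetical. The neighbor maps may hold parent, child or cross-referenced provisions.

```python
def jaccard(a, b):
    """Overlap of two feature collections, 0.0 when both are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def base_score(p, q, weights):
    """Weighted average of per-feature overlaps (content comparison)."""
    total = sum(weights.values())
    return sum(w * jaccard(p[f], q[f]) for f, w in weights.items()) / total


def refined_score(p_id, q_id, provs_a, provs_b, nbrs_a, nbrs_b,
                  weights, alpha=0.7):
    """Blend a pair's own score with the mean score of its neighbor pairs."""
    own = base_score(provs_a[p_id], provs_b[q_id], weights)
    pairs = [(m, n)
             for m in nbrs_a.get(p_id, [])
             for n in nbrs_b.get(q_id, [])]
    if not pairs:
        return own  # no structural context available
    around = sum(base_score(provs_a[m], provs_b[n], weights)
                 for m, n in pairs) / len(pairs)
    return alpha * own + (1 - alpha) * around


# Hypothetical toy data: two regulations with one child provision each.
weights = {"concepts": 1.0, "measurements": 1.0}
reg_a = {
    "a1": {"concepts": ["ramp", "slope"], "measurements": ["36 inches"]},
    "a1.1": {"concepts": ["handrail"], "measurements": []},
}
reg_b = {
    "b1": {"concepts": ["ramp"], "measurements": ["36 inches"]},
    "b1.1": {"concepts": ["handrail"], "measurements": []},
}
children_a = {"a1": ["a1.1"]}
children_b = {"b1": ["b1.1"]}

score = refined_score("a1", "b1", reg_a, reg_b, children_a, children_b, weights)
```

In this sketch a pair of provisions whose surrounding provisions also match each other receives a different score than the content comparison alone would give, which is the intuition behind exploiting the hierarchy and reference structure.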
System performance is evaluated by comparing a similarity ranking produced by users with the machine-predicted ranking. The ranking produced by the relatedness analysis system shows a reduction in error compared with that of Latent Semantic Indexing. Various pairs of regulations are compared, and the results are analyzed along with observations on how different features contribute. An example e-rulemaking scenario demonstrates the capabilities of the prototype system.
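One simple way to quantify the disagreement between a user ranking and a machine ranking is the mean absolute difference in rank position. The abstract does not state which error measure is used, so the metric below is an assumption for illustration only, and the function name `rank_error` and the provision labels are hypothetical.

```python
def rank_error(user_ranking, machine_ranking):
    """Mean absolute difference in rank position over the shared items."""
    machine_pos = {item: i for i, item in enumerate(machine_ranking)}
    user_pos = {item: i for i, item in enumerate(user_ranking)}
    shared = [item for item in user_ranking if item in machine_pos]
    if not shared:
        raise ValueError("rankings share no items")
    return sum(abs(user_pos[i] - machine_pos[i]) for i in shared) / len(shared)


# Hypothetical rankings of three provisions, most related first.
user = ["p3", "p1", "p2"]
machine = ["p1", "p3", "p2"]
error = rank_error(user, machine)
```

Under this measure a value of 0 means the two rankings agree exactly, and two systems can be compared by computing the error of each against the same user ranking.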
Index Terms
- A comparative analysis framework for semi-structured documents, with applications to government regulations