A Text Analysis-Based Predictive Approach for Assessing Clause Risk: An Application in Construction Contracts

The identification and evaluation of risky contractual clauses remain a critical challenge in the construction industry during the tender phase. Such clauses can expose the contractors and other parties to conflicts and disputes, which will cause delays and cost overruns during project execution. He...

Bibliographic Details
Main Author: Abouelwy, Seifeldin Ahmed
Format: Thesis
Published: AUC Knowledge Fountain 2026
Subjects: Text Analysis; Construction Contracts; Machine Learning; Construction Engineering and Management
access_status_str Open Access
author Abouelwy, Seifeldin Ahmed
collection Thesis
description The identification and evaluation of risky contractual clauses remain a critical challenge in the construction industry during the tender phase. Such clauses can expose contractors and other parties to conflicts and disputes, which cause delays and cost overruns during project execution. Traditional contract review processes currently in practice, together with time-consuming assessment methods that rely heavily on expert judgment and manual procedures, cause inconsistency and lead to human error, particularly when multiple contracts must be reviewed under strict deadlines. To address these issues, this study introduces an automated, data-driven framework, referred to as the Contracts Assessment Tool (CAT), that leverages text mining techniques and machine learning algorithms to enhance the contract evaluation process. The CAT integrates contractual clauses collected from multiple projects with expert assessments and generates an automated report in which each clause is classified according to its impact level, probability of occurrence, and similarity to a reference contract. To achieve this objective, two main paths were undertaken. First, a text extraction model was developed to accurately identify, extract, and compare contractual clauses. Second, data were collected from contracts and experts, then preprocessed and visualized to extract meaningful insights. Finally, clause risk probability and impact classification models were developed and validated using machine learning techniques including Random Forest, SVM, KNN, XGBoost, Naïve Bayes, and Logistic Regression. The results showed that Logistic Regression performed best, with an accuracy of 0.740 and an F1-score of 0.736 for the risk model, and an accuracy of 0.710 and an F1-score of 0.707 for the probability model.
However, the use of resampling techniques, particularly the ADASYN approach, enhanced the models' performance, with the SVM achieving an accuracy of 0.922 and an F1-score of 0.921 for the risk model, and an accuracy of 0.928 and an F1-score of 0.926 for the probability model. Finally, the CAT was tested on two contract documents from different projects, where it successfully identified, extracted, and evaluated clauses, assigning accurate classifications for both impact and probability of occurrence. These results demonstrate CAT’s capability to support contract engineers by accelerating the review process, reducing human error, and improving efficiency, showing the potential of automated, machine learning-based tools to enhance contract evaluation and strengthen contract risk identification in the pre-award phase.
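The abstract says each clause is scored for similarity against a reference contract, but it does not name the similarity measure. As a rough illustration only, the sketch below assumes cosine similarity over simple word-count vectors; the function names and the sample clauses are hypothetical and are not taken from the thesis.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    shared_terms = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[term] * counts_b[term] for term in shared_terms)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        # A text with no word tokens is treated as similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def best_reference_match(clause, reference_clauses):
    """Return (index, score) of the reference clause most similar to `clause`."""
    scores = [cosine_similarity(clause, ref) for ref in reference_clauses]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index, scores[best_index]


# Hypothetical clauses, invented for illustration only.
reference_clauses = [
    "The Contractor shall complete the Works within the Time for Completion.",
    "The Employer shall pay the Contractor within 56 days of each certificate.",
]
tender_clause = (
    "The Contractor shall complete the whole of the Works "
    "within the Time for Completion."
)

index, score = best_reference_match(tender_clause, reference_clauses)
print(index, round(score, 3))
```

Word counts are used here only to keep the sketch dependency-free; a TF-IDF weighting or an embedding-based measure would slot into `cosine_similarity` in the same way, and the thesis may well use a different method.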
format Thesis
id oai:fount.aucegypt.edu:etds-3736
institution American University in Cairo (Egypt)
last_indexed 2026-06-10T12:35:59.828Z
license_str Not specified — see source repository
provenance_str_mv Harvested via OAI-PMH from AUC Knowledge Fountain — bepress
publishDate 2026
publisher AUC Knowledge Fountain
record_format dspace
source_str AUC Knowledge Fountain — bepress
spelling oai:fount.aucegypt.edu:etds-3736 2026-03-03T08:00:00Z thesis application/pdf https://fount.aucegypt.edu/etds/2674 https://fount.aucegypt.edu/context/etds/article/3736/viewcontent/seifeldin_ahmed_abouelwy_thesis.pdf Theses and Dissertations AUC Knowledge Fountain
title A Text Analysis-Based Predictive Approach for Assessing Clause Risk: An Application in Construction Contracts
title_sort text analysis based predictive approach for assessing clause risk an application in construction contracts
topic Text Analysis
Construction Contracts
Machine Learning
Construction Engineering and Management
url https://fount.aucegypt.edu/etds/2674
https://fount.aucegypt.edu/context/etds/article/3736/viewcontent/seifeldin_ahmed_abouelwy_thesis.pdf