Automated Lab Submission & Integrity Checker (Python OOP)

🧩 Introduction

This project was developed as a Python final project in May 2025.
It automates the process of managing student lab submissions — renaming files, verifying submission integrity, and generating analytical reports.

It originated from a real problem in college courses:
instructors often receive hundreds of student .docx submissions with inconsistent naming conventions, misplaced files, and even duplicate or cross-referenced content.
The goal was to design a robust, automated, and object-oriented solution.

🧰 File Management

Standardizing file names with course code and student information
Matching student IDs with class roster sequence numbers for organized grading
Ensuring consistent file naming across all submissions
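
The naming step can be sketched as a small helper. This is only an illustration, assuming the target pattern shown in the demo directory (for example `COURSECODE_A1_01_AAAA0001.docx`); the function name `build_standard_name` and its parameters are hypothetical and not taken from the project code.

```python
def build_standard_name(course_code, assignment, seq_number, student_id, ext=".docx"):
    """Hypothetical helper: compose COURSECODE_ASSIGNMENT_##_studentid.docx."""
    # Zero-pad the roster sequence number to two digits so that files
    # sort in roster order in a directory listing.
    return f"{course_code}_{assignment}_{seq_number:02d}_{student_id}{ext}"


print(build_standard_name("COURSECODE", "A1", 1, "AAAA0001"))
# COURSECODE_A1_01_AAAA0001.docx
```

Two-digit padding is an assumption that suits rosters of up to 99 students; a larger class would need a wider field.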

🔍 Content Analysis

Detecting shared content between submissions by identifying student IDs
Supporting both plagiarism detection and group work verification:
- Identifying unauthorized sharing in individual assignments
- Extracting team member information in group projects
Generating detailed cross-reference reports for instructor review
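
One way to turn raw ID mentions into something an instructor can review is to separate mutual references from one-way references. The sketch below is an assumption about how such a step might look, not the project's actual report logic; `classify_references` and the sample data are hypothetical. Treating a mutual reference as likely group work is only a heuristic, and the final judgment stays with the instructor.

```python
def classify_references(mentions):
    """Split ID mentions into mutual pairs and one-way (author, other) pairs.

    `mentions` maps an author's ID to the set of IDs found in that author's file.
    """
    mutual, one_way = set(), set()
    for author, found in mentions.items():
        for other in found:
            if other == author:
                continue  # a student's own ID is expected in their file
            if author in mentions.get(other, set()):
                # Both files name each other: store the pair once, sorted.
                mutual.add(tuple(sorted((author, other))))
            else:
                one_way.add((author, other))
    return sorted(mutual), sorted(one_way)


sample = {
    "aaaa0001": {"bbbb0002"},
    "bbbb0002": {"aaaa0001"},
    "cccc0003": {"aaaa0001"},
}
mutual, one_way = classify_references(sample)
print(mutual)   # [('aaaa0001', 'bbbb0002')]
print(one_way)  # [('cccc0003', 'aaaa0001')]
```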

🗂 Demo Directory Structure

├── project_main.py                    Main program file
├── student_roster.csv                 Roster file with anonymized sequence numbers
│
├── sample_submissions/                Source directory for anonymized submissions
│   ├── Lab1_AAAA0001_COURSECODE_.docx
│   ├── BBBB0002_COURSECODE_Lab2.docx
│   ├── COURSECODE_Lab3_CCCC0003.docx
│   └── ... (other anonymized submission files)
│
└── output_processed/                  Directory for renamed and analyzed files
    ├── COURSECODE_A1_01_AAAA0001.docx   (Renamed with sequence number)
    ├── COURSECODE_A1_02_BBBB0002.docx   (Standardized naming format)
    ├── report_check.txt                 (Cross-reference report)
    └── file_summary.txt                 (Processing summary)

🧩 Key Features

├── Class: Student
│   ├── __init__():                     Initialize student object with basic information
│   ├── parse_name():                   Parse full name into first and last name
│   └── generate_condor_id():           Generate Condor ID from the roster CSV (student_roster.csv in the demo)
│
├── Class: FileChecker
│   ├── __init__():                     Initialize checker with directories and course info
│   ├── load_student_data():            Load and map student roster sequence numbers
│   ├── extract_id_from_filename():     Extract student ID from various filename formats
│   ├── check_file_content():           Identify other student IDs in file content
│   ├── process_files():                Rename files with roster sequence numbers
│   ├── generate_check_report():        Create content cross-reference analysis
│   └── generate_file_list():           Summarize file processing results
│
└── Function: main()
    └── Program entry point and user interaction handling
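
The roster-mapping step behind `load_student_data()` might look roughly like the sketch below. The column names `seq` and `student_id` are assumptions made for illustration (the real roster layout is not shown here), and `load_roster` is a hypothetical stand-in, not the project's method.

```python
import csv
import io


def load_roster(csv_file):
    """Map each student ID (lowercased) to its roster sequence number."""
    roster = {}
    for row in csv.DictReader(csv_file):
        # Assumed columns: "seq" (sequence number) and "student_id".
        roster[row["student_id"].strip().lower()] = int(row["seq"])
    return roster


# An in-memory CSV stands in for student_roster.csv.
sample_csv = io.StringIO(
    "seq,student_id,name\n"
    "1,AAAA0001,Student One\n"
    "2,BBBB0002,Student Two\n"
)
print(load_roster(sample_csv))
# {'aaaa0001': 1, 'bbbb0002': 2}
```

Lowercasing the keys keeps the lookup consistent with the ID extraction step, which lowercases filenames before matching.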

🧠 Code Logic Example

🔹 ID Extraction via Regex

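import re

def extract_id_from_filename(self, filename):
    # A student ID is four letters followed by four digits, e.g. aaaa0001.
    # In a raw string, \d must use a single backslash; "\\d" would look for
    # a literal backslash followed by the letter d.
    # Any match of the second (delimited) pattern is also a match of the
    # first, so the second pattern acts only as a safeguard.
    patterns = [r"([a-zA-Z]{4}\d{4})", r"_([a-zA-Z]{4}\d{4})[._]"]
    for pattern in patterns:
        match = re.search(pattern, filename.lower())
        if match:
            return match.group(1)  # ID is returned in lowercase
    return None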
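
To show the extraction on the three filename layouts from the demo directory, here is a standalone sketch. It assumes the same ID shape (four letters followed by four digits); `extract_id` is a simplified, hypothetical variant written for this example rather than the class method itself.

```python
import re

# Four letters followed by four digits; filenames are lowercased before matching.
ID_PATTERN = re.compile(r"([a-z]{4}\d{4})")


def extract_id(filename):
    """Return the first ID-shaped token in the filename, or None."""
    match = ID_PATTERN.search(filename.lower())
    return match.group(1) if match else None


for name in [
    "Lab1_AAAA0001_COURSECODE_.docx",
    "BBBB0002_COURSECODE_Lab2.docx",
    "COURSECODE_Lab3_CCCC0003.docx",
    "notes.docx",
]:
    print(name, "->", extract_id(name))
# Lab1_AAAA0001_COURSECODE_.docx -> aaaa0001
# BBBB0002_COURSECODE_Lab2.docx -> bbbb0002
# COURSECODE_Lab3_CCCC0003.docx -> cccc0003
# notes.docx -> None
```

One caveat: a real course code that itself consists of four letters and four digits would match this pattern too, and would win if it came first in the filename. Removing the known course code from the filename before matching is one possible guard.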

🔹 Cross-Reference Detection

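from docx import Document  # provided by the python-docx package

def check_file_content(self, file_path, id_list):
    '''Scans a .docx file to find any other student IDs mentioned in the content.'''
    found_ids = set()
    try:
        doc = Document(file_path)
        # doc.paragraphs covers body paragraphs only, not text inside tables.
        for para in doc.paragraphs:
            text = para.text.lower()  # compare case-insensitively
            for sid in id_list:
                if sid.lower() in text:
                    found_ids.add(sid)
    except Exception:
        # Unreadable or corrupt files are skipped silently and yield no IDs.
        pass
    return list(found_ids)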
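
The matching logic can also be tried on plain text, without opening a `.docx` file. The sketch below is an alternative formulation rather than the project's method: it assumes the four-letter, four-digit ID shape, collects every ID-shaped token in one regex pass, and intersects the result with the roster. The name `find_other_ids` and the `own_id` parameter are hypothetical.

```python
import re


def find_other_ids(text, id_list, own_id=None):
    """Return the sorted roster IDs mentioned in text, ignoring case.

    If own_id is given, the author's own ID is left out of the result.
    """
    known = {sid.lower() for sid in id_list}
    # Collect every token shaped like an ID, then keep only roster members.
    candidates = set(re.findall(r"[a-z]{4}\d{4}", text.lower()))
    found = candidates & known
    if own_id:
        found.discard(own_id.lower())
    return sorted(found)


roster_ids = ["aaaa0001", "bbbb0002", "cccc0003", "dddd0004"]
text = "Worked with BBBB0002 and cccc0003; my ID is aaaa0001."
print(find_other_ids(text, roster_ids, own_id="AAAA0001"))
# ['bbbb0002', 'cccc0003']
```

Scanning the text once and intersecting with a set avoids testing every roster ID against every paragraph, which may matter for large classes.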

📄 Output Format

Renamed files:
COURSECODE_ASSIGNMENT_##_studentid.docx
(## represents the student’s sequence number from the class roster)
Cross-reference report:
Details of student ID mentions across files
Processing summary:
Complete file handling results

🧰 Technical Summary

Concept        Implementation
Language       Python 3.12
Paradigm       Object-Oriented Programming (OOP)
Libraries      `os`, `re`, `csv`, `docx` (from the python-docx package), `collections`
Techniques     Regex parsing, I/O automation, exception handling
Output Files   `file_list.txt`, `check.txt`
Data Source    Roster CSV + `.docx` submissions

🧭 Lessons Learned

Through this project, I gained hands-on experience with:

Designing structured automation workflows using OOP principles
Managing file systems and text data parsing
Applying regex to real-world problems
Handling errors and exceptions for robust operation

This project helped me understand how automation tools can transform tedious, error-prone tasks into efficient workflows.
It also strengthened my confidence in building real-world Python applications that merge practicality with clean design.

🔒 Notes on Data Privacy

All student names, IDs, and course codes have been anonymized for public release.
Original data and institutional references were removed to ensure confidentiality and compliance.