Skip to content

Commit 9d2c25b

Browse files
committed
feat: implement submission file workflow with comprehensive testing
- Create Submission file model - Develop SubmissionFileManager for centralized DB query handling - Add TestSubmissionFileManager with extensive test coverage - Test file processing with different input types (bytes, file objects) - Implement error handling tests for IO errors and invalid files - Verify complete file processing workflow - Validation tests for file addition in create_external_grader_detail - Document architectural decisions with ADR - Add test queue folder in .gitignore
1 parent 9ce14ad commit 9d2c25b

File tree

8 files changed

+665
-0
lines changed

8 files changed

+665
-0
lines changed

Diff for: .gitignore

+1
Original file line numberDiff line numberDiff line change
@@ -63,3 +63,4 @@ test.txt.tmp
6363

6464
# VSCode
6565
.vscode
66+
test_queue/*
+150
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,150 @@
1+
2. File Handling Implementation for Submission System
2+
#####################################################
3+
4+
Status
5+
******
6+
7+
**Provisional** *2025-02-14*
8+
9+
Implemented by: waiting for it
10+
11+
Context
12+
*******
13+
14+
As part of the XQueue migration effort detailed in ADR 1, we need to handle file submissions that were previously
15+
managed by the XQueue system. The existing implementation lacks dedicated file handling capabilities within
16+
edx-submissions, requiring:
17+
18+
1. File Management:
19+
- Secure storage of student-submitted files
20+
- Association with specific submissions
21+
- Compatibility with XQueue URL formats
22+
- Support for multiple file types and formats
23+
24+
2. Processing Requirements:
25+
- Robust error handling for file operations
26+
- Validation of file objects
27+
- Dynamic path generation
28+
- Proper cleanup and resource management
29+
30+
3. Integration Points:
31+
- Connection with SubmissionQueueRecord
32+
- Support for existing xqueue-watcher interface
33+
- File URL format compatibility
34+
- Efficient database querying
35+
36+
The current XQueue system implements file handling through its Submission model, which manages file storage and
37+
retrieval in a tightly coupled way. The existing implementation:
38+
39+
1. Current File Management in XQueue:
40+
- Uses a Submission model with direct file storage fields (s3_keys and s3_urls)
41+
- Handles file uploads through manual storage management functions (upload_file_dict, upload)
42+
- Stores file URLs as JSON strings in model fields
43+
- Requires synchronous HTTP communication for file retrieval
44+
45+
2. Submission Model File Fields:
46+
```python
47+
s3_keys = models.CharField(max_length=CHARFIELD_LEN_LARGE) # keys for internal use
48+
s3_urls = models.CharField(max_length=CHARFIELD_LEN_LARGE) # urls for external access
49+
```
50+
51+
3. Current Workflow Issues:
52+
- File handling is tightly coupled with submission processing
53+
- Relies on manual URL construction and string manipulation
54+
- Lacks proper file validation and type checking
55+
- Limited by CharField size for storing file information
56+
- Requires complex get_submission logic for file retrieval
57+
58+
4. Xqueue-watcher Integration:
59+
- External graders rely on specific URL formats
60+
- File access depends on HTTP-based retrieval
61+
- File information is embedded in submission payload
62+
63+
Decision
64+
********
65+
66+
We will implement a new SubmissionFile model and supporting infrastructure to handle file management within
67+
edx-submissions:
68+
69+
1. Core Model Structure:
70+
- SubmissionFile model with UUID-based identification
71+
- Foreign key relationship to SubmissionQueueRecord
72+
- Django FileField for actual file storage
73+
- Original filename preservation
74+
- Creation timestamp tracking
75+
- Composite index for efficient querying
76+
77+
2. File Management Layer:
78+
- SubmissionFileManager class for encapsulated file operations
79+
- Robust file processing with multiple format support
80+
- Error handling for various file-related exceptions
81+
- XQueue-compatible URL generation
82+
83+
3. Integration with Submission Creation:
84+
- Extended create_external_grader_detail function
85+
- File processing during submission creation
86+
- Automatic file manager instantiation
87+
- Error handling and logging
88+
89+
Consequences
90+
************
91+
92+
Positive:
93+
---------
94+
95+
1. Architecture:
96+
- Clean separation of concerns for file handling
97+
- Improved maintainability through dedicated models
98+
- Better error handling and logging
99+
- Efficient database querying through proper indexing
100+
101+
2. Integration:
102+
- Seamless xqueue-watcher compatibility
103+
- Support for existing file processing workflows
104+
- Minimal changes required to client code
105+
106+
3. Operations:
107+
- More robust file processing
108+
- Better tracking of file submissions
109+
- Improved error visibility
110+
- Simplified file management
111+
112+
Negative:
113+
---------
114+
115+
1. Technical Impact:
116+
- Additional database tables and indexes
117+
- Increased storage requirements
118+
- More complex submission creation flow
119+
120+
2. Migration Considerations:
121+
- Temporary increased system complexity
122+
- Additional testing requirements
123+
124+
3. Performance:
125+
- Additional database operations during submission
126+
- File processing overhead
127+
128+
References
129+
**********
130+
131+
Current System Documentation:
132+
* XQueue Repository: https://github.com/openedx/xqueue
133+
* XQueue Watcher Repository: https://github.com/openedx/xqueue-watcher
134+
135+
Related Repositories:
136+
* edx-submissions: https://github.com/openedx/edx-submissions
137+
* edx-platform: https://github.com/openedx/edx-platform
138+
* XQueue Repository: https://github.com/openedx/xqueue
139+
* edx-submissions: https://github.com/openedx/edx-submissions
140+
141+
Related Documentation:
142+
* ADR 1: Creation of SubmissionQueueRecord Model for XQueue Migration
143+
* File Processing Documentation: https://github.com/openedx/edx-submissions/tree/master/docs
144+
145+
Future Event Integration:
146+
* Open edX Events Framework: https://github.com/openedx/openedx-events
147+
* Event Bus Documentation: https://openedx.atlassian.net/wiki/spaces/AC/pages/124125264/Event+Bus
148+
149+
Related Architecture Documents:
150+
* Open edX Architecture Guidelines: https://openedx.atlassian.net/wiki/spaces/AC/pages/124125264/Architecture+Guidelines

Diff for: submissions/api.py

+7
Original file line numberDiff line numberDiff line change
@@ -28,6 +28,7 @@
2828
ScoreSummary,
2929
StudentItem,
3030
Submission,
31+
SubmissionFileManager,
3132
score_reset,
3233
score_set
3334
)
@@ -83,6 +84,12 @@ def create_external_grader_detail(student_item_dict, answer, **external_grader_d
8384
grader_file_name=external_grader_detail.get('grader_file_name', ''),
8485
points_possible=external_grader_detail.get('points_possible', 1),
8586
)
87+
88+
files_dict = external_grader_detail.get('files')
89+
if files_dict:
90+
file_manager = SubmissionFileManager(instance)
91+
file_manager.process_files(files_dict)
92+
8693
return instance
8794

8895
except DatabaseError as error:

Diff for: submissions/migrations/0005_submissionfile.py

+32
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,32 @@
1+
# Generated by Django 4.2.17 on 2025-03-14 22:35
2+
3+
from django.db import migrations, models
4+
import django.db.models.deletion
5+
import django.utils.timezone
6+
import submissions.models
7+
import uuid
8+
9+
10+
class Migration(migrations.Migration):
11+
12+
dependencies = [
13+
('submissions', '0004_externalgraderdetail'),
14+
]
15+
16+
operations = [
17+
migrations.CreateModel(
18+
name='SubmissionFile',
19+
fields=[
20+
('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
21+
('uid', models.UUIDField(default=uuid.uuid4, editable=False)),
22+
('file', models.FileField(max_length=512, upload_to=submissions.models.submission_file_path)),
23+
('original_filename', models.CharField(max_length=255)),
24+
('created_at', models.DateTimeField(default=django.utils.timezone.now)),
25+
('submission_queue', models.ForeignKey(null=True, on_delete=django.db.models.deletion.SET_NULL,
26+
related_name='files', to='submissions.externalgraderdetail')),
27+
],
28+
options={
29+
'indexes': [models.Index(fields=['submission_queue', 'uid'], name='submissions_submiss_b1d4b0_idx')],
30+
},
31+
),
32+
]

Diff for: submissions/models.py

+102
Original file line numberDiff line numberDiff line change
@@ -10,11 +10,14 @@
1010
"""
1111

1212
import logging
13+
import os
1314
from datetime import timedelta
1415
from uuid import uuid4
1516

1617
from django.conf import settings
1718
from django.contrib import auth
19+
from django.core.files.base import ContentFile
20+
from django.core.files.uploadedfile import SimpleUploadedFile
1821
from django.db import DatabaseError, models, transaction
1922
from django.db.models.signals import post_save, pre_save
2023
from django.dispatch import Signal, receiver
@@ -696,3 +699,102 @@ def update_status(self, new_status):
696699
def create_from_uuid(cls, submission_uuid, **kwargs):
697700
submission = Submission.objects.get(uuid=submission_uuid)
698701
return cls.objects.create(submission=submission, **kwargs)
702+
703+
704+
def submission_file_path(instance, _):
705+
"""
706+
Generate file path for submission files.
707+
Format: queue_name/uuid
708+
The filename is replaced with the UUID to ensure uniqueness without preserving extension.
709+
"""
710+
return os.path.join(
711+
instance.submission_queue.queue_name,
712+
f"{instance.uid}"
713+
)
714+
715+
716+
class SubmissionFile(models.Model):
717+
"""
718+
Model to handle files associated with submissions
719+
"""
720+
uid = models.UUIDField(default=uuid4, editable=False)
721+
# legacy S3 key
722+
submission_queue = models.ForeignKey(
723+
'submissions.ExternalGraderDetail',
724+
on_delete=models.SET_NULL,
725+
related_name='files',
726+
null=True,
727+
)
728+
file = models.FileField(
729+
upload_to=submission_file_path,
730+
max_length=512
731+
)
732+
original_filename = models.CharField(max_length=255) # This is necessary to send file name to xqueue-watcher
733+
created_at = models.DateTimeField(default=now)
734+
735+
class Meta:
736+
indexes = [
737+
models.Index(fields=['submission_queue', 'uid']),
738+
]
739+
740+
@property
741+
def xqueue_url(self):
742+
"""
743+
Returns URL in xqueue format: /queue_name/uid
744+
"""
745+
return f"/{self.submission_queue.queue_name}/{self.uid}"
746+
747+
748+
class SubmissionFileManager:
749+
"""
750+
Manages file operations for submissions
751+
"""
752+
753+
def __init__(self, submission_queue):
754+
self.submission_queue = submission_queue
755+
756+
def process_files(self, files_dict):
757+
"""
758+
Process uploaded files.
759+
Returns URLs in xqueue compatible format.
760+
"""
761+
files_urls = {}
762+
for filename, file_obj in files_dict.items():
763+
if not (isinstance(file_obj, (bytes, ContentFile, SimpleUploadedFile)) or hasattr(file_obj, 'read')):
764+
logger.warning(f"Invalid file object type for {filename}")
765+
continue
766+
767+
if hasattr(file_obj, 'read'):
768+
try:
769+
file_content = file_obj.read()
770+
if isinstance(file_content, bytes):
771+
file_obj = ContentFile(file_content, name=filename)
772+
else:
773+
continue
774+
except (IOError, OSError) as e:
775+
logger.error(f"Error reading file {filename}: {e}")
776+
continue
777+
except UnicodeDecodeError as e:
778+
logger.error(f"Error decoding file {filename}: {e}")
779+
continue
780+
781+
if isinstance(file_obj, bytes):
782+
file_obj = ContentFile(file_obj, name=filename)
783+
784+
submission_file = SubmissionFile.objects.create(
785+
submission_queue=self.submission_queue,
786+
file=file_obj,
787+
original_filename=filename
788+
)
789+
files_urls[filename] = submission_file.xqueue_url
790+
791+
return files_urls
792+
793+
def get_files_for_grader(self):
794+
"""
795+
Returns files in format expected by xwatcher
796+
"""
797+
return {
798+
file.original_filename: file.xqueue_url
799+
for file in self.submission_queue.files.all()
800+
}

Diff for: submissions/tests/test_api.py

+53
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,7 @@
1010
import pytz
1111
# Django imports
1212
from django.core.cache import cache
13+
from django.core.files.uploadedfile import SimpleUploadedFile
1314
from django.db import DatabaseError, IntegrityError, connection, transaction
1415
from django.test import TestCase
1516
from django.utils.timezone import now
@@ -924,3 +925,55 @@ def test_create_multiple_submission_external_grader_details(self):
924925
self.assertEqual(submission2.answer, ANSWER_TWO)
925926

926927
self.assertNotEqual(external_grader_detail1.submission.uuid, external_grader_detail2.submission.uuid)
928+
929+
def test_create_external_grader_detail_with_files(self):
930+
"""Test creating a queue record with file handling."""
931+
932+
test_file = SimpleUploadedFile(
933+
"test.txt",
934+
b"test content",
935+
content_type="text/plain"
936+
)
937+
938+
event_data = {
939+
'queue_name': 'test_queue',
940+
'files': {'test.txt': test_file}
941+
}
942+
943+
queue_record = api.create_external_grader_detail(STUDENT_ITEM, ANSWER_ONE, **event_data)
944+
945+
self.assertEqual(queue_record.queue_name, 'test_queue')
946+
self.assertEqual(queue_record.files.count(), 1)
947+
submission_file = queue_record.files.first()
948+
self.assertEqual(submission_file.original_filename, 'test.txt')
949+
950+
def test_create_external_grader_detail_with_multiple_files(self):
951+
"""Test creating a queue record with multiple files."""
952+
test_files = {
953+
'test1.txt': SimpleUploadedFile(
954+
"test1.txt",
955+
b"test content 1",
956+
content_type="text/plain"
957+
),
958+
'test2.txt': SimpleUploadedFile(
959+
"test2.txt",
960+
b"test content 2",
961+
content_type="text/plain"
962+
)
963+
}
964+
965+
external_grader_detail = {
966+
'queue_name': 'test_queue',
967+
'files': test_files
968+
}
969+
queue_record = api.create_external_grader_detail(STUDENT_ITEM, ANSWER_ONE, **external_grader_detail)
970+
971+
self.assertEqual(queue_record.files.count(), 2)
972+
filenames = set(queue_record.files.values_list('original_filename', flat=True))
973+
self.assertEqual(filenames, {'test1.txt', 'test2.txt'})
974+
975+
def test_create_external_grader_detail_without_files(self):
976+
"""Test creating a queue record without any files still works."""
977+
external_grader_instance = api.create_external_grader_detail(STUDENT_ITEM, ANSWER_ONE, queue_name="test_queue")
978+
self.assertEqual(external_grader_instance.queue_name, 'test_queue')
979+
self.assertEqual(external_grader_instance.files.count(), 0)

0 commit comments

Comments
 (0)