Skip to content

Commit b9ce70a

Browse files
committed
feat: implement submission file workflow with comprehensive testing
- Create Submission file model - Develop SubmissionFileManager for centralized DB query handling - Add TestSubmissionFileManager with extensive test coverage - Test file processing with different input types (bytes, file objects) - Implement error handling tests for IO errors and invalid files - Verify complete file processing workflow - Validation tests for file addition in create_external_grader_detail - Document architectural decisions with ADR - Add test queue folder in .gitignore
1 parent 2413e6a commit b9ce70a

File tree

8 files changed

+665
-0
lines changed

8 files changed

+665
-0
lines changed

Diff for: .gitignore

+1
Original file line numberDiff line numberDiff line change
@@ -63,3 +63,4 @@ test.txt.tmp
6363

6464
# VSCode
6565
.vscode
66+
test_queue/*
+150
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,150 @@
1+
2. File Handling Implementation for Submission System
2+
#####################################################
3+
4+
Status
5+
******
6+
7+
**Provisional** *2025-02-14*
8+
9+
Implemented by: waiting for it
10+
11+
Context
12+
*******
13+
14+
As part of the XQueue migration effort detailed in ADR 1, we need to handle file submissions that were previously
15+
managed by the XQueue system. The existing implementation lacks dedicated file handling capabilities within
16+
edx-submissions, requiring:
17+
18+
1. File Management:
19+
- Secure storage of student-submitted files
20+
- Association with specific submissions
21+
- Compatibility with XQueue URL formats
22+
- Support for multiple file types and formats
23+
24+
2. Processing Requirements:
25+
- Robust error handling for file operations
26+
- Validation of file objects
27+
- Dynamic path generation
28+
- Proper cleanup and resource management
29+
30+
3. Integration Points:
31+
- Connection with SubmissionQueueRecord
32+
- Support for existing xqueue-watcher interface
33+
- File URL format compatibility
34+
- Efficient database querying
35+
36+
The current XQueue system implements file handling through its Submission model, which manages file storage and
37+
retrieval in a tightly coupled way. The existing implementation:
38+
39+
1. Current File Management in XQueue:
40+
- Uses a Submission model with direct file storage fields (s3_keys and s3_urls)
41+
- Handles file uploads through manual storage management functions (upload_file_dict, upload)
42+
- Stores file URLs as JSON strings in model fields
43+
- Requires synchronous HTTP communication for file retrieval
44+
45+
2. Submission Model File Fields:
46+
```python
47+
s3_keys = models.CharField(max_length=CHARFIELD_LEN_LARGE) # keys for internal use
48+
s3_urls = models.CharField(max_length=CHARFIELD_LEN_LARGE) # urls for external access
49+
```
50+
51+
3. Current Workflow Issues:
52+
- File handling is tightly coupled with submission processing
53+
- Relies on manual URL construction and string manipulation
54+
- Lacks proper file validation and type checking
55+
- Limited by CharField size for storing file information
56+
- Requires complex get_submission logic for file retrieval
57+
58+
4. Xqueue-watcher Integration:
59+
- External graders rely on specific URL formats
60+
- File access depends on HTTP-based retrieval
61+
- File information is embedded in submission payload
62+
63+
Decision
64+
********
65+
66+
We will implement a new SubmissionFile model and supporting infrastructure to handle file management within
67+
edx-submissions:
68+
69+
1. Core Model Structure:
70+
- SubmissionFile model with UUID-based identification
71+
- Foreign key relationship to SubmissionQueueRecord
72+
- Django FileField for actual file storage
73+
- Original filename preservation
74+
- Creation timestamp tracking
75+
- Composite index for efficient querying
76+
77+
2. File Management Layer:
78+
- SubmissionFileManager class for encapsulated file operations
79+
- Robust file processing with multiple format support
80+
- Error handling for various file-related exceptions
81+
- XQueue-compatible URL generation
82+
83+
3. Integration with Submission Creation:
84+
- Extended create_external_grader_detail function
85+
- File processing during submission creation
86+
- Automatic file manager instantiation
87+
- Error handling and logging
88+
89+
Consequences
90+
************
91+
92+
Positive:
93+
---------
94+
95+
1. Architecture:
96+
- Clean separation of concerns for file handling
97+
- Improved maintainability through dedicated models
98+
- Better error handling and logging
99+
- Efficient database querying through proper indexing
100+
101+
2. Integration:
102+
- Seamless xqueue-watcher compatibility
103+
- Support for existing file processing workflows
104+
- Minimal changes required to client code
105+
106+
3. Operations:
107+
- More robust file processing
108+
- Better tracking of file submissions
109+
- Improved error visibility
110+
- Simplified file management
111+
112+
Negative:
113+
---------
114+
115+
1. Technical Impact:
116+
- Additional database tables and indexes
117+
- Increased storage requirements
118+
- More complex submission creation flow
119+
120+
2. Migration Considerations:
121+
- Temporary increased system complexity
122+
- Additional testing requirements
123+
124+
3. Performance:
125+
- Additional database operations during submission
126+
- File processing overhead
127+
128+
References
129+
**********
130+
131+
Current System Documentation:
132+
* XQueue Repository: https://github.com/openedx/xqueue
133+
* XQueue Watcher Repository: https://github.com/openedx/xqueue-watcher
134+
135+
Related Repositories:
136+
* edx-submissions: https://github.com/openedx/edx-submissions
137+
* edx-platform: https://github.com/openedx/edx-platform
138+
* XQueue Repository: https://github.com/openedx/xqueue
139+
* edx-submissions: https://github.com/openedx/edx-submissions
140+
141+
Related Documentation:
142+
* ADR 1: Creation of SubmissionQueueRecord Model for XQueue Migration
143+
* File Processing Documentation: https://github.com/openedx/edx-submissions/tree/master/docs
144+
145+
Future Event Integration:
146+
* Open edX Events Framework: https://github.com/openedx/openedx-events
147+
* Event Bus Documentation: https://openedx.atlassian.net/wiki/spaces/AC/pages/124125264/Event+Bus
148+
149+
Related Architecture Documents:
150+
* Open edX Architecture Guidelines: https://openedx.atlassian.net/wiki/spaces/AC/pages/124125264/Architecture+Guidelines

Diff for: submissions/api.py

+7
Original file line numberDiff line numberDiff line change
@@ -28,6 +28,7 @@
2828
ScoreSummary,
2929
StudentItem,
3030
Submission,
31+
SubmissionFileManager,
3132
score_reset,
3233
score_set
3334
)
@@ -96,6 +97,12 @@ def create_external_grader_detail(student_item_dict,
9697
points_possible=points_possible,
9798

9899
)
100+
101+
files_dict = external_grader_additional_data.get('files')
102+
if files_dict:
103+
file_manager = SubmissionFileManager(instance)
104+
file_manager.process_files(files_dict)
105+
99106
return instance
100107

101108
except DatabaseError as error:

Diff for: submissions/migrations/0005_submissionfile.py

+32
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,32 @@
1+
# Generated by Django 4.2.17 on 2025-03-14 22:35
2+
3+
from django.db import migrations, models
4+
import django.db.models.deletion
5+
import django.utils.timezone
6+
import submissions.models
7+
import uuid
8+
9+
10+
class Migration(migrations.Migration):
11+
12+
dependencies = [
13+
('submissions', '0004_externalgraderdetail'),
14+
]
15+
16+
operations = [
17+
migrations.CreateModel(
18+
name='SubmissionFile',
19+
fields=[
20+
('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
21+
('uid', models.UUIDField(default=uuid.uuid4, editable=False)),
22+
('file', models.FileField(max_length=512, upload_to=submissions.models.submission_file_path)),
23+
('original_filename', models.CharField(max_length=255)),
24+
('created_at', models.DateTimeField(default=django.utils.timezone.now)),
25+
('submission_queue', models.ForeignKey(null=True, on_delete=django.db.models.deletion.SET_NULL,
26+
related_name='files', to='submissions.externalgraderdetail')),
27+
],
28+
options={
29+
'indexes': [models.Index(fields=['submission_queue', 'uid'], name='submissions_submiss_b1d4b0_idx')],
30+
},
31+
),
32+
]

Diff for: submissions/models.py

+102
Original file line numberDiff line numberDiff line change
@@ -10,11 +10,14 @@
1010
"""
1111

1212
import logging
13+
import os
1314
from datetime import timedelta
1415
from uuid import uuid4
1516

1617
from django.conf import settings
1718
from django.contrib import auth
19+
from django.core.files.base import ContentFile
20+
from django.core.files.uploadedfile import SimpleUploadedFile
1821
from django.db import DatabaseError, models, transaction
1922
from django.db.models.signals import post_save, pre_save
2023
from django.dispatch import Signal, receiver
@@ -693,3 +696,102 @@ def update_status(self, new_status):
693696
def create_from_uuid(cls, submission_uuid, **kwargs):
694697
submission = Submission.objects.get(uuid=submission_uuid)
695698
return cls.objects.create(submission=submission, **kwargs)
699+
700+
701+
def submission_file_path(instance, _):
702+
"""
703+
Generate file path for submission files.
704+
Format: queue_name/uuid
705+
The filename is replaced with the UUID to ensure uniqueness without preserving extension.
706+
"""
707+
return os.path.join(
708+
instance.submission_queue.queue_name,
709+
f"{instance.uid}"
710+
)
711+
712+
713+
class SubmissionFile(models.Model):
714+
"""
715+
Model to handle files associated with submissions
716+
"""
717+
uid = models.UUIDField(default=uuid4, editable=False)
718+
# legacy S3 key
719+
submission_queue = models.ForeignKey(
720+
'submissions.ExternalGraderDetail',
721+
on_delete=models.SET_NULL,
722+
related_name='files',
723+
null=True,
724+
)
725+
file = models.FileField(
726+
upload_to=submission_file_path,
727+
max_length=512
728+
)
729+
original_filename = models.CharField(max_length=255) # This is necessary to send file name to xqueue-watcher
730+
created_at = models.DateTimeField(default=now)
731+
732+
class Meta:
733+
indexes = [
734+
models.Index(fields=['submission_queue', 'uid']),
735+
]
736+
737+
@property
738+
def xqueue_url(self):
739+
"""
740+
Returns URL in xqueue format: /queue_name/uid
741+
"""
742+
return f"/{self.submission_queue.queue_name}/{self.uid}"
743+
744+
745+
class SubmissionFileManager:
746+
"""
747+
Manages file operations for submissions
748+
"""
749+
750+
def __init__(self, submission_queue):
751+
self.submission_queue = submission_queue
752+
753+
def process_files(self, files_dict):
754+
"""
755+
Process uploaded files.
756+
Returns URLs in xqueue compatible format.
757+
"""
758+
files_urls = {}
759+
for filename, file_obj in files_dict.items():
760+
if not (isinstance(file_obj, (bytes, ContentFile, SimpleUploadedFile)) or hasattr(file_obj, 'read')):
761+
logger.warning(f"Invalid file object type for {filename}")
762+
continue
763+
764+
if hasattr(file_obj, 'read'):
765+
try:
766+
file_content = file_obj.read()
767+
if isinstance(file_content, bytes):
768+
file_obj = ContentFile(file_content, name=filename)
769+
else:
770+
continue
771+
except (IOError, OSError) as e:
772+
logger.error(f"Error reading file {filename}: {e}")
773+
continue
774+
except UnicodeDecodeError as e:
775+
logger.error(f"Error decoding file {filename}: {e}")
776+
continue
777+
778+
if isinstance(file_obj, bytes):
779+
file_obj = ContentFile(file_obj, name=filename)
780+
781+
submission_file = SubmissionFile.objects.create(
782+
submission_queue=self.submission_queue,
783+
file=file_obj,
784+
original_filename=filename
785+
)
786+
files_urls[filename] = submission_file.xqueue_url
787+
788+
return files_urls
789+
790+
def get_files_for_grader(self):
791+
"""
792+
Returns files in format expected by xwatcher
793+
"""
794+
return {
795+
file.original_filename: file.file.url
796+
for file in self.submission_queue.files.all()
797+
}

Diff for: submissions/tests/test_api.py

+53
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,7 @@
1010
import pytz
1111
# Django imports
1212
from django.core.cache import cache
13+
from django.core.files.uploadedfile import SimpleUploadedFile
1314
from django.db import DatabaseError, IntegrityError, connection, transaction
1415
from django.test import TestCase
1516
from django.utils.timezone import now
@@ -923,3 +924,55 @@ def test_create_multiple_submission_external_grader_details(self):
923924
self.assertEqual(submission2.answer, ANSWER_TWO)
924925

925926
self.assertNotEqual(external_grader_detail1.submission.uuid, external_grader_detail2.submission.uuid)
927+
928+
def test_create_external_grader_detail_with_files(self):
929+
"""Test creating a queue record with file handling."""
930+
931+
test_file = SimpleUploadedFile(
932+
"test.txt",
933+
b"test content",
934+
content_type="text/plain"
935+
)
936+
937+
event_data = {
938+
'queue_name': 'test_queue',
939+
'files': {'test.txt': test_file}
940+
}
941+
942+
external_grader_detail = api.create_external_grader_detail(STUDENT_ITEM, ANSWER_ONE, **event_data)
943+
944+
self.assertEqual(external_grader_detail.queue_name, 'test_queue')
945+
self.assertEqual(external_grader_detail.files.count(), 1)
946+
submission_file = external_grader_detail.files.first()
947+
self.assertEqual(submission_file.original_filename, 'test.txt')
948+
949+
def test_create_external_grader_detail_with_multiple_files(self):
950+
"""Test creating a queue record with multiple files."""
951+
test_files = {
952+
'test1.txt': SimpleUploadedFile(
953+
"test1.txt",
954+
b"test content 1",
955+
content_type="text/plain"
956+
),
957+
'test2.txt': SimpleUploadedFile(
958+
"test2.txt",
959+
b"test content 2",
960+
content_type="text/plain"
961+
)
962+
}
963+
964+
external_grader_detail = {
965+
'queue_name': 'test_queue',
966+
'files': test_files
967+
}
968+
external_grader_detail = api.create_external_grader_detail(STUDENT_ITEM, ANSWER_ONE, **external_grader_detail)
969+
970+
self.assertEqual(external_grader_detail.files.count(), 2)
971+
filenames = set(external_grader_detail.files.values_list('original_filename', flat=True))
972+
self.assertEqual(filenames, {'test1.txt', 'test2.txt'})
973+
974+
def test_create_external_grader_detail_without_files(self):
975+
"""Test creating a queue record without any files still works."""
976+
external_grader_instance = api.create_external_grader_detail(STUDENT_ITEM, ANSWER_ONE, queue_name="test_queue")
977+
self.assertEqual(external_grader_instance.queue_name, 'test_queue')
978+
self.assertEqual(external_grader_instance.files.count(), 0)

0 commit comments

Comments
 (0)