Pearl1594
Contributor

Description

Fixes: #11252
This PR fixes an issue with deploying CKS clusters in basic zones. When the control node is deployed, an IP from the shared network IP pool is allocated to it, i.e. it is marked as allocated, but allocatedTime isn't set. When an attempt is then made to reserve an IP for the virtual router, the following query runs:
SELECT u.* FROM user_ip_address u INNER JOIN vlan v ON v.id = u.vlan_db_id INNER JOIN pod_vlan_map pvm ON pvm.vlan_db_id = u.vlan_db_id WHERE u.data_center_id = ? AND v.vlan_type = "DirectAttached" AND v.network_id = ? AND u.allocated IS NULL AND u.vlan_db_id = ? AND u.forsystemvms = 0 ORDER BY u.forsystemvms ASC, u.vlan_db_id ASC LIMIT 1

The query picks the first IP whose allocated column (the allocated time) is NULL, so it returns an IP that has already been assigned to the control node.
To address this, the allocated time is set together with the state update, so an assigned IP no longer matches the query.
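
The failure mode described above can be reproduced in miniature. The following is a minimal sketch, not CloudStack's actual code: it assumes a heavily simplified `user_ip_address` table (only `id`, `public_ip_address`, `state` and `allocated`), an in-memory SQLite database instead of MySQL, and hypothetical helper names (`make_pool`, `fetch_free_ip`, `assign`, `reserve_two`). It only illustrates why a lookup keyed on `allocated IS NULL` hands out the same address twice when the timestamp is not written.

```python
import sqlite3
from datetime import datetime, timezone


def make_pool(ips):
    """Create a simplified, hypothetical user_ip_address table with free IPs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_ip_address ("
        " id INTEGER PRIMARY KEY,"
        " public_ip_address TEXT NOT NULL,"
        " state TEXT NOT NULL DEFAULT 'Free',"
        " allocated TEXT)"  # allocation timestamp; NULL means "never allocated"
    )
    conn.executemany(
        "INSERT INTO user_ip_address (public_ip_address) VALUES (?)",
        [(ip,) for ip in ips],
    )
    return conn


def fetch_free_ip(conn):
    """Mimic the lookup from the description: first row with allocated IS NULL."""
    row = conn.execute(
        "SELECT public_ip_address FROM user_ip_address"
        " WHERE allocated IS NULL ORDER BY id ASC LIMIT 1"
    ).fetchone()
    return row[0] if row else None


def assign(conn, ip, set_allocated_time):
    """Mark an IP as allocated, optionally also recording the allocation time."""
    allocated = (
        datetime.now(timezone.utc).isoformat() if set_allocated_time else None
    )
    conn.execute(
        "UPDATE user_ip_address SET state = 'Allocated', allocated = ?"
        " WHERE public_ip_address = ?",
        (allocated, ip),
    )


def reserve_two(set_allocated_time):
    """Reserve one IP for the control node, then one for the virtual router."""
    conn = make_pool(["10.0.0.10", "10.0.0.11"])
    control_node_ip = fetch_free_ip(conn)
    assign(conn, control_node_ip, set_allocated_time)
    router_ip = fetch_free_ip(conn)
    return control_node_ip, router_ip


if __name__ == "__main__":
    print("state only:       ", reserve_two(set_allocated_time=False))
    print("state + timestamp:", reserve_two(set_allocated_time=True))
```

With only the state updated, both reservations resolve to the same address; once the timestamp is written alongside the state, the second lookup skips the control node's IP.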

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • build/CI
  • test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

Deployed CKS cluster in a basic zone
(two screenshots attached)

How did you try to break this feature and the system with this change?

@Pearl1594
Contributor Author

@blueorangutan package

codecov bot commented Aug 15, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 16.17%. Comparing base (c6daeb4) to head (5af3447).

Additional details and impacted files
@@             Coverage Diff              @@
##               4.20   #11457      +/-   ##
============================================
- Coverage     16.17%   16.17%   -0.01%     
- Complexity    13285    13286       +1     
============================================
  Files          5656     5656              
  Lines        498000   498001       +1     
  Branches      60401    60401              
============================================
- Hits          80539    80534       -5     
- Misses       408496   408505       +9     
+ Partials       8965     8962       -3     
Flag Coverage Δ
uitests 4.00% <ø> (ø)
unittests 17.02% <100.00%> (-0.01%) ⬇️

Contributor

@sureshanaparti left a comment

clgtm

@sureshanaparti
Contributor

@blueorangutan package

@blueorangutan

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 14643

@Pearl1594
Contributor Author

@blueorangutan test

@blueorangutan

@Pearl1594 a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@DaanHoogland
Contributor

@blueorangutan test keepEnv basicZone

@blueorangutan

@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian test result (tid-14069)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 20801 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr11457-t14069-kvm-ol8.zip
Smoke tests completed. 127 look OK, 14 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_01_events_resource Error 7.16 test_events_resource.py
test_replace_acl_of_network Error 0.43 test_global_acls.py
test_deploy_more_vms_than_limit_allows Failure 55.50 test_deploy_vms_in_parallel.py
test_01_create_ipv6_public_ip_range Error 0.03 test_ipv6_infra.py
test_08_listsystemvms Failure 0.04 test_list_volumes.py
ContextSuite context=TestIpv6Network>:setup Error 0.00 test_network_ipv6.py
test_create_role Error 7.04 test_private_roles.py
test_create_role Error 7.04 test_private_roles.py
test_another_user_can_allocate_ip_after_quarantined_has_ended_network Error 7.60 test_quarantined_ips.py
test_another_user_can_allocate_ip_after_quarantined_has_ended_vpc Error 0.41 test_quarantined_ips.py
test_only_owner_can_allocate_ip_in_quarantine_network Error 0.43 test_quarantined_ips.py
test_only_owner_can_allocate_ip_in_quarantine_vpc Error 0.41 test_quarantined_ips.py
test_CRUD_operations_userdata Error 20.71 test_register_userdata.py
test_deploy_vm_with_registered_userdata Error 5.54 test_register_userdata.py
test_deploy_vm_with_registered_userdata_with_override_policy_allow Error 5.44 test_register_userdata.py
test_deploy_vm_with_registered_userdata_with_override_policy_append Error 5.52 test_register_userdata.py
test_deploy_vm_with_registered_userdata_with_override_policy_deny Error 5.51 test_register_userdata.py
test_deploy_vm_with_registered_userdata_with_params Error 5.34 test_register_userdata.py
test_link_and_unlink_userdata_to_template Error 5.50 test_register_userdata.py
test_user_userdata_crud Error 5.42 test_register_userdata.py
test_03_restart_network_cleanup Error 1.14 test_routers.py
ContextSuite context=TestISOUsage>:setup Error 0.00 test_usage.py
test_13_migrate_volume_and_change_offering Error 131.59 test_volumes.py
ContextSuite context=TestIpv6Vpc>:setup Error 0.00 test_vpc_ipv6.py
test_02_cancel_host_maintenace_with_migration_jobs Error 0.33 test_host_maintenance.py
test_03_cancel_host_maintenace_with_migration_jobs_failure Error 0.37 test_host_maintenance.py
ContextSuite context=TestHostMaintenanceAgents>:setup Error 0.60 test_host_maintenance.py

@blueorangutan

[SF] Trillian test result (tid-14068)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 57056 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr11457-t14068-kvm-ol8.zip
Smoke tests completed. 131 look OK, 10 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_13_retrieve_vr_default_files Error 1.15 test_diagnostics.py
test_14_retrieve_vr_one_file Error 1.13 test_diagnostics.py
test_15_retrieve_ssvm_default_files Error 1.13 test_diagnostics.py
test_16_retrieve_ssvm_single_file Error 1.16 test_diagnostics.py
test_17_retrieve_cpvm_default_files Error 1.14 test_diagnostics.py
test_18_retrieve_cpvm_single_file Error 1.15 test_diagnostics.py
test_03_purge_expunged_api_vm_start_end_date Error 2.84 test_purge_expunged_vms.py
test_04_purge_expunged_api_vm_no_date Error 1.74 test_purge_expunged_vms.py
test_05_purge_expunged_vm_service_offering Error 1.41 test_purge_expunged_vms.py
test_06_purge_expunged_vm_background_task Error 346.39 test_purge_expunged_vms.py
test_deploy_vm_with_registered_userdata_with_override_policy_append Error 7.68 test_register_userdata.py
test_deploy_vm_with_registered_userdata_with_override_policy_append Error 7.69 test_register_userdata.py
test_deploy_vm_with_registered_userdata_with_params Error 7.36 test_register_userdata.py
test_deploy_vm_with_registered_userdata_with_params Error 7.37 test_register_userdata.py
ContextSuite context=TestRouterDHCPHosts>:setup Error 0.00 test_router_dhcphosts.py
ContextSuite context=TestRouterDHCPOpts>:setup Error 0.00 test_router_dhcphosts.py
test_02_list_cpvm_vm Failure 0.04 test_ssvm.py
test_04_cpvm_internals Failure 0.07 test_ssvm.py
test_01_create_volume Error 254.91 test_volumes.py
test_01_root_volume_encryption Error 1.42 test_volumes.py
test_02_data_volume_encryption Error 1.26 test_volumes.py
test_03_root_and_data_volume_encryption Error 1.36 test_volumes.py
ContextSuite context=TestVolumes>:setup Error 33.12 test_volumes.py
ContextSuite context=TestIpv6Vpc>:setup Error 0.00 test_vpc_ipv6.py
test_01_create_redundant_VPC_2tiers_4VMs_4IPs_4PF_ACL Failure 80.14 test_vpc_redundant.py
test_02_redundant_VPC_default_routes Error 13.40 test_vpc_redundant.py
test_01_redundant_vpc_site2site_vpn Failure 281.82 test_vpc_vpn.py
test_01_redundant_vpc_site2site_vpn Error 281.84 test_vpc_vpn.py
test_02_cancel_host_maintenace_with_migration_jobs Error 1.59 test_host_maintenance.py
test_03_cancel_host_maintenace_with_migration_jobs_failure Error 1.66 test_host_maintenance.py
test_01_cancel_host_maintenance_ssh_enabled_agent_connected Failure 18.65 test_host_maintenance.py
test_03_cancel_host_maintenance_ssh_disabled_agent_connected Failure 21.68 test_host_maintenance.py
test_04_cancel_host_maintenance_ssh_disabled_agent_disconnected Failure 29.36 test_host_maintenance.py
ContextSuite context=TestHostMaintenanceAgents>:teardown Error 30.50 test_host_maintenance.py

@rajujith left a comment

LGTM
Tested. I was able to create a CKS cluster in a basic zone.

@weizhouapache
Member

@blueorangutan test

@blueorangutan

@weizhouapache a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian test result (tid-14079)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 54782 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr11457-t14079-kvm-ol8.zip
Smoke tests completed. 140 look OK, 1 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_01_deployVMInSharedNetwork Failure 430.76 test_network.py

Contributor

@DaanHoogland left a comment

makes sense, thanks for the explanation @Pearl1594 clgtm

Member

@weizhouapache left a comment

lgtm

@blueorangutan

[SF] Trillian test result (tid-14080)
Environment: kvm-ol8 (x2), zone: Basic Networking with Mgmt server ol8
Total time taken: 22124 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr11457-t14080-kvm-ol8.zip
Smoke tests completed. 125 look OK, 16 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_01_events_resource Error 6.90 test_events_resource.py
test_replace_acl_of_network Error 0.43 test_global_acls.py
test_deploy_more_vms_than_limit_allows Failure 54.38 test_deploy_vms_in_parallel.py
test_01_create_ipv6_public_ip_range Error 0.04 test_ipv6_infra.py
test_08_listsystemvms Failure 0.04 test_list_volumes.py
ContextSuite context=TestIpv6Network>:setup Error 0.00 test_network_ipv6.py
test_create_role Error 6.65 test_private_roles.py
test_create_role Error 6.66 test_private_roles.py
test_another_user_can_allocate_ip_after_quarantined_has_ended_network Error 6.47 test_quarantined_ips.py
test_another_user_can_allocate_ip_after_quarantined_has_ended_vpc Error 0.40 test_quarantined_ips.py
test_only_owner_can_allocate_ip_in_quarantine_network Error 0.36 test_quarantined_ips.py
test_only_owner_can_allocate_ip_in_quarantine_vpc Error 0.42 test_quarantined_ips.py
test_CRUD_operations_userdata Error 20.76 test_register_userdata.py
test_deploy_vm_with_registered_userdata Error 5.00 test_register_userdata.py
test_deploy_vm_with_registered_userdata_with_override_policy_allow Error 4.51 test_register_userdata.py
test_deploy_vm_with_registered_userdata_with_override_policy_append Error 5.25 test_register_userdata.py
test_deploy_vm_with_registered_userdata_with_override_policy_deny Error 4.81 test_register_userdata.py
test_deploy_vm_with_registered_userdata_with_params Error 4.65 test_register_userdata.py
test_link_and_unlink_userdata_to_template Error 4.69 test_register_userdata.py
test_user_userdata_crud Error 5.04 test_register_userdata.py
test_03_restart_network_cleanup Error 1.19 test_routers.py
ContextSuite context=TestVMWareStoragePolicies>:setup Error 0.00 test_storage_policy.py
ContextSuite context=TestTemplates>:setup Error 0.00 test_templates.py
ContextSuite context=TestISOUsage>:setup Error 0.00 test_usage.py
ContextSuite context=TestVolumes>:setup Error 694.29 test_volumes.py
ContextSuite context=TestIpv6Vpc>:setup Error 0.00 test_vpc_ipv6.py
test_02_cancel_host_maintenace_with_migration_jobs Error 0.34 test_host_maintenance.py
test_03_cancel_host_maintenace_with_migration_jobs_failure Error 0.40 test_host_maintenance.py
ContextSuite context=TestHostMaintenanceAgents>:setup Error 0.61 test_host_maintenance.py

@sureshanaparti merged commit 6e59f4f into 4.20 on Aug 21, 2025
64 of 66 checks passed
@sureshanaparti deleted the fix-cks-basic-zone branch on August 21, 2025, 13:02
dhslove pushed a commit to ablecloud-team/ablestack-cloud that referenced this pull request on Sep 4, 2025
Development

Successfully merging this pull request may close these issues.

Failed to fetch any free public IP address on K8S cluster creation over Basic Network
6 participants