24-3-15 can not start after main #15232

Closed
zverevgeny opened this issue Mar 3, 2025 · 5 comments · Fixed by #15878 · May be fixed by #15983 or #16071
@zverevgeny
Collaborator

zverevgeny commented Mar 3, 2025

https://github.com/ydb-platform/ydb/blob/stable-24-3-15-hotfix/ydb/core/tx/columnshard/engines/column_engine_logs.cpp#L167

Database on which the problem was reproduced: https://nda.ya.ru/t/i3LICXeu7CR5vb

Mar  3 06:32:05 vm-etn8i0ga1e2grsvoqvv9-ru-central1-a-hjjw-ydew kikimr[302691]: 0. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/util/system/yassert.cpp:83: NPrivate::InternalPanicImpl(int, char const*, char const*, int, int, int, TBasicStringBuf<char, std::__y1::char_traits<char>>, char const*, unsigned long) @ 0x55EB7AAE545C
Mar  3 06:32:05 vm-etn8i0ga1e2grsvoqvv9-ru-central1-a-hjjw-ydew kikimr[302691]: 1. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/util/system/yassert.cpp:55: NPrivate::Panic(NPrivate::TStaticBuf const&, int, char const*, char const*, char const*, ...) @ 0x55EB7AADDC1B
Mar  3 06:32:05 vm-etn8i0ga1e2grsvoqvv9-ru-central1-a-hjjw-ydew kikimr[302691]: 2. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/library/actors/core/log.cpp:747: NActors::TVerifyFormattedRecordWriter::~TVerifyFormattedRecordWriter() @ 0x55EB7B04EF14
Mar  3 06:32:05 vm-etn8i0ga1e2grsvoqvv9-ru-central1-a-hjjw-ydew kikimr[302691]: 3. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/engines/column_engine_logs.cpp:167: NKikimr::NOlap::TColumnEngineForLogs::RegisterSchemaVersion(NKikimr::NOlap::TSnapshot const&, NKikimrSchemeOp::TColumnTableSchema const&) @ 0x55EB85BD7421
Mar  3 06:32:05 vm-etn8i0ga1e2grsvoqvv9-ru-central1-a-hjjw-ydew kikimr[302691]: 4. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/engines/column_engine_logs.cpp:38: NKikimr::NOlap::TColumnEngineForLogs::TColumnEngineForLogs(unsigned long, std::__y1::shared_ptr<NKikimr::NOlap::IStoragesManager> const&, NKikimr::NOlap::TSnapshot const&, NKikimrSchemeOp::TColumnTableSchema const&) @ 0x55EB85BD42B9
Mar  3 06:32:05 vm-etn8i0ga1e2grsvoqvv9-ru-central1-a-hjjw-ydew kikimr[302691]: 5. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/libs/cxxsupp/libcxx/include/__memory/unique_ptr.h:686: std::__y1::__unique_if<NKikimr::NOlap::TColumnEngineForLogs>::__unique_single std::__y1::make_unique[abi:v180000]<NKikimr::NOlap::TColumnEngineForLogs, unsigned long&, std::__y1::shared_ptr<NKikimr::NOlap::IStoragesManager>&, NKikimr::NOlap::TSnapshot, NKikimrSchemeOp::TColumnTableSchema const&>(unsigned long&, std::__y1::shared_ptr<NKikimr::NOlap::IStoragesManager>&, NKikimr::NOlap::TSnapshot&&, NKikimrSchemeOp::TColumnTableSchema const&) @ 0x55EB85E909B7
Mar  3 06:32:05 vm-etn8i0ga1e2grsvoqvv9-ru-central1-a-hjjw-ydew kikimr[302691]: 6. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/tables_manager.cpp:179: NKikimr::NColumnShard::TTablesManager::InitFromDB(NKikimr::NIceDb::TNiceDb&) @ 0x55EB85E909B7
Mar  3 06:32:05 vm-etn8i0ga1e2grsvoqvv9-ru-central1-a-hjjw-ydew kikimr[302691]: 7. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/normalizer/portion/normalizer.cpp:26: NKikimr::NOlap::TPortionsNormalizerBase::DoInit(NKikimr::NOlap::TNormalizationController const&, NKikimr::NTabletFlatExecutor::TTransactionContext&) @ 0x55EB85CE123E
Mar  3 06:32:05 vm-etn8i0ga1e2grsvoqvv9-ru-central1-a-hjjw-ydew kikimr[302691]: 8. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/normalizer/abstract/abstract.cpp:139: NKikimr::NOlap::TNormalizationController::INormalizerComponent::Init(NKikimr::NOlap::TNormalizationController const&, NKikimr::NTabletFlatExecutor::TTransactionContext&) @ 0x55EB85CDC34D
Mar  3 06:32:06 vm-etn8i0ga1e2grsvoqvv9-ru-central1-a-hjjw-ydew kikimr[302691]: 9. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tx/columnshard/columnshard__init.cpp:288: NKikimr::NColumnShard::TTxUpdateSchema::Execute(NKikimr::NTabletFlatExecutor::TTransactionContext&, NActors::TActorContext const&) @ 0x55EB85DE6165
Mar  3 06:32:06 vm-etn8i0ga1e2grsvoqvv9-ru-central1-a-hjjw-ydew kikimr[302691]: 10. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tablet_flat/flat_executor.cpp:1726: NKikimr::NTabletFlatExecutor::TExecutor::ExecuteTransaction(TAutoPtr<NKikimr::NTabletFlatExecutor::TSeat, TDelete>, NActors::TActorContext const&) @ 0x55EB7E9757CE
Mar  3 06:32:06 vm-etn8i0ga1e2grsvoqvv9-ru-central1-a-hjjw-ydew kikimr[302691]: 11. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tablet_flat/flat_executor.cpp:2647: NKikimr::NTabletFlatExecutor::TExecutor::Handle(TAutoPtr<NActors::TEventHandle<NKikimr::NTabletFlatExecutor::TExecutor::TEvPrivate::TEvActivateExecution>, TDelete>&, NActors::TActorContext const&) @ 0x55EB7E982201
Mar  3 06:32:06 vm-etn8i0ga1e2grsvoqvv9-ru-central1-a-hjjw-ydew kikimr[302691]: 12. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/core/tablet_flat/flat_executor.cpp:3971: NKikimr::NTabletFlatExecutor::TExecutor::StateWork(TAutoPtr<NActors::IEventHandle, TDelete>&) @ 0x55EB7E96379A
Mar  3 06:32:06 vm-etn8i0ga1e2grsvoqvv9-ru-central1-a-hjjw-ydew kikimr[302691]: 13. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:251: NActors::TGenericExecutorThread::TProcessingResult NActors::TGenericExecutorThread::Execute<NActors::TMailboxTable::TReadAsFilledMailbox>(NActors::TMailboxTable::TReadAsFilledMailbox*, unsigned int, bool) @ 0x55EB7B030738
Mar  3 06:32:06 vm-etn8i0ga1e2grsvoqvv9-ru-central1-a-hjjw-ydew kikimr[302691]: 14. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:441: NActors::TGenericExecutorThread::ProcessExecutorPool(NActors::IExecutorPool*)::$_0::operator()(unsigned int, bool) const @ 0x55EB7B024B8E
Mar  3 06:32:06 vm-etn8i0ga1e2grsvoqvv9-ru-central1-a-hjjw-ydew kikimr[302691]: 15. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:493: NActors::TGenericExecutorThread::ProcessExecutorPool(NActors::IExecutorPool*) @ 0x55EB7B0244DF
Mar  3 06:32:06 vm-etn8i0ga1e2grsvoqvv9-ru-central1-a-hjjw-ydew kikimr[302691]: 16. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/contrib/ydb/library/actors/core/executor_thread.cpp:524: NActors::TExecutorThread::ThreadProc() @ 0x55EB7B0253D6
Mar  3 06:32:06 vm-etn8i0ga1e2grsvoqvv9-ru-central1-a-hjjw-ydew kikimr[302691]: 17. /opt/buildagent/work/fc36a8b43d71f87e/__FUSE/mount_path/util/system/thread.cpp:244: (anonymous namespace)::TPosixThread::ThreadProxy(void*) @ 0x55EB7AAEB8C9
Mar  3 06:32:06 vm-etn8i0ga1e2grsvoqvv9-ru-central1-a-hjjw-ydew kikimr[302691]: 18. /build/glibc-e2p3jK/glibc-2.31/nptl/pthread_create.c:477: start_thread @ 0x7FF4FCE42608
Mar  3 06:32:06 vm-etn8i0ga1e2grsvoqvv9-ru-central1-a-hjjw-ydew kikimr[302691]: 19. ../sysdeps/unix/sysv/linux/x86_64/clone.S:95: ?? @ 0x7FF4FCD62352
@dorooleg
Collaborator

  1. Why did the normalizer start? (As a first step, check the config.)
  2. Why did the verify fire?

@dorooleg
Collaborator

Reproduction scenario (ydb_experimental_vla/b1gtl2kg13him37quoo6/etn8i0ga1e2grsvoqvv9):

  1. Install main.
  2. Write data to the table. Check that the SuppressCompatibilityCheck flag is set: https://a.yandex-team.ru/arcadia/contrib/ydb/core/protos/feature_flags.proto?rev=r16123280#L128
  3. Roll back to 24-3-15 (dynamic nodes only; leave storage on the old version).

@aavdonkin
Collaborator

The normalizers run because stable has fewer of them than main.
Here is the code responsible for this: https://a.yandex-team.ru/arcadia/contrib/ydb/core/tx/columnshard/normalizer/abstract/abstract.cpp?rev=r16126919#L86

@aavdonkin
Collaborator

The crash happens because the schema cannot be deserialized due to this condition:

if (schema.GetEngine() != NKikimrSchemeOp::COLUMN_ENGINE_REPLACING_TIMESERIES) {

main has no such condition.
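
To illustrate why such a check can break a rollback, here is a minimal standalone sketch. It is not the actual YDB code: `EColumnEngine`, `TSchemaProto`, `TIndexInfo`, `DeserializeStrict` and `DeserializeLenient` are hypothetical stand-ins, and the premise that the newer binary stores a schema with the engine field left unset is an assumption made only for this example.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for the protobuf engine enum mentioned in the thread.
enum class EColumnEngine {
    None,                // assumption: what a reader sees when the writer left the field unset
    ReplacingTimeseries, // mirrors COLUMN_ENGINE_REPLACING_TIMESERIES
};

// Hypothetical stand-in for the stored schema record.
struct TSchemaProto {
    EColumnEngine Engine = EColumnEngine::None;
    std::string Name;
};

// Hypothetical result of a successful deserialization.
struct TIndexInfo {
    std::string Name;
};

// Strict reader (models the stable branch): rejects any schema whose engine
// is not the single expected value.
std::optional<TIndexInfo> DeserializeStrict(const TSchemaProto& schema) {
    if (schema.Engine != EColumnEngine::ReplacingTimeseries) {
        return std::nullopt;
    }
    return TIndexInfo{schema.Name};
}

// Lenient reader (models main, which has no such condition).
std::optional<TIndexInfo> DeserializeLenient(const TSchemaProto& schema) {
    return TIndexInfo{schema.Name};
}

// Demo of the rollback scenario under the stated assumption.
inline void DemoRollback() {
    TSchemaProto writtenByNewer{EColumnEngine::None, "t"};
    assert(DeserializeLenient(writtenByNewer).has_value()); // newer binary loads it
    assert(!DeserializeStrict(writtenByNewer).has_value()); // older binary rejects it
}
```

In this sketch the strict reader returns an empty optional. If the caller treats a failed deserialization as an invariant violation, the process aborts at startup, which is consistent with the verify failure in `RegisterSchemaVersion` shown in the stack trace.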

@zverevgeny
Collaborator Author

PR in which the problem was introduced: #10958
