INFO GpuDeviceManager: Initializing RMM ASYNC pool size = 17840.349609375 MB on gpuId 0

Identify which SQL, job and stage is involved in the error

The relationship between SQL, job, and stage is: a stage belongs to a job, which belongs to a SQL query. First check the Spark UI to identify the problematic SQL ID, Job ID, and Stage ID. Then find the failed stage on the Stages page in the Spark UI, and go into that stage to look at its tasks.

Check the DAG of the problematic stage to see if there are any suspicious operators which may consume huge amounts of memory, such as windowing, collect_list/collect_set, explode, expand, etc.

If some tasks completed successfully while others failed with OOM, check the amount of input bytes or shuffle bytes read per task to see if there is any data skew.

Increase the number of tasks/partitions based on the type of the problematic stage

Table Scan Stage

If it is a table scan stage on Parquet/ORC tables, then the number of tasks or partitions is normally determined by a configuration setting. We can decrease its value to increase the number of tasks or partitions for this stage, so that each task is under less memory pressure.

Iceberg or Delta tables may have different settings to control the concurrency in the table scan stage. For example, Iceberg uses -size as a table property or read option to control the split size.
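The two checks described above, looking for data skew across tasks and reasoning about how a smaller split size raises the task count, can be sketched in a few lines of Python. This is only an illustrative approximation: the function names, the 5x skew threshold, and the sample byte counts are assumptions made for this example, and the task estimate ignores file boundaries and the other factors Spark takes into account when it plans splits. The per-task byte counts would be copied by hand from the task table of the failed stage in the Spark UI.

```python
import math
from statistics import median


def estimate_scan_tasks(total_input_bytes: int, max_split_bytes: int) -> int:
    """Rough task count for a scan: one task per split of at most max_split_bytes.

    Hypothetical helper; real split planning also depends on file layout.
    """
    if max_split_bytes <= 0:
        raise ValueError("max_split_bytes must be positive")
    return max(1, math.ceil(total_input_bytes / max_split_bytes))


def find_skewed_tasks(bytes_per_task: list[int], factor: float = 5.0) -> list[int]:
    """Return indexes of tasks whose input exceeds `factor` times the median.

    The default factor of 5 is an arbitrary threshold chosen for illustration.
    """
    if not bytes_per_task:
        return []
    typical = median(bytes_per_task)
    return [i for i, b in enumerate(bytes_per_task) if b > factor * typical]


MiB = 1024 * 1024
GiB = 1024 * MiB

# Halving the split size doubles the estimated number of scan tasks.
print(estimate_scan_tasks(64 * GiB, 128 * MiB))  # 512
print(estimate_scan_tasks(64 * GiB, 64 * MiB))   # 1024

# Task 3 reads far more than the others, which suggests data skew.
print(find_skewed_tasks([100 * MiB, 110 * MiB, 95 * MiB, 4 * GiB]))  # [3]
```

If the skew check returns nothing but every task still reads a large amount of data, reducing the split size is the more likely fix; if it flags a handful of tasks, the skewed keys or files are worth investigating first.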