
[Failing Test]: Running which java seems to cause flakes in some YAML tests on GHA runners #30854

@tvalentyn

Description

What happened?

In the stack trace below, the process appears to get stuck while running subprocess.run(['which', 'java']). Filing this issue to track it in case the failure is common.

cc: @Polber
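
A possible mitigation on the provider side (a sketch, not taken from this issue; the function name below is made up for illustration) is to resolve the binary with shutil.which, which scans PATH in-process instead of forking a child. The observed hang happens inside Popen's fork/exec, before a subprocess timeout would apply, so avoiding the fork entirely is more robust than adding a timeout:

```python
import shutil


def java_available() -> bool:
    # Sketch of an alternative to subprocess.run(['which', 'java']):
    # shutil.which scans PATH in-process, so no child process is forked.
    # The hang in the traceback occurs inside Popen's fork/exec (the
    # os.read on the error pipe), before a subprocess-level timeout would
    # take effect, so skipping the fork sidesteps that failure mode.
    return shutil.which('java') is not None
```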

________________ AggregationTest.test_combine_mean_minimal_yaml ________________
[gw1] linux -- Python 3.9.19 /runner/_work/beam/beam/sdks/python/test-suites/tox/py39/build/srcs/sdks/python/target/.tox-py39/py39/bin/python

self = <apache_beam.yaml.examples.examples_test.AggregationExamplesTest testMethod=test_combine_mean_minimal_yaml>

    @mock.patch('apache_beam.Pipeline', TestPipeline)
    def test_yaml_example(self):
      with open(pipeline_spec_file, encoding="utf-8") as f:
        lines = f.readlines()
      expected_key = '# Expected:\n'
      if expected_key in lines:
        expected = lines[lines.index('# Expected:\n') + 1:]
      else:
        raise ValueError(
            f"Missing '# Expected:' tag in example file '{pipeline_spec_file}'")
      for i, line in enumerate(expected):
        expected[i] = line.replace('#  ', '').replace('\n', '')
      pipeline_spec = yaml.load(
          ''.join(lines), Loader=yaml_transform.SafeLineLoader)
    
      with TestEnvironment() as env:
        if custom_preprocessor:
          pipeline_spec = custom_preprocessor(pipeline_spec, expected, env)
        with beam.Pipeline(options=PipelineOptions(
            pickle_library='cloudpickle',
            **yaml_transform.SafeLineLoader.strip_metadata(pipeline_spec.get(
                'options', {})))) as p:
>         actual = yaml_transform.expand_pipeline(p, pipeline_spec)

apache_beam/yaml/examples/examples_test.py:77: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
target/.tox-py39/py39/lib/python3.9/site-packages/apache_beam/yaml/yaml_transform.py:1035: in expand_pipeline
    return YamlTransform(
target/.tox-py39/py39/lib/python3.9/site-packages/apache_beam/yaml/yaml_transform.py:1006: in expand
    result = expand_transform(
target/.tox-py39/py39/lib/python3.9/site-packages/apache_beam/yaml/yaml_transform.py:455: in expand_transform
    return expand_composite_transform(spec, scope)
target/.tox-py39/py39/lib/python3.9/site-packages/apache_beam/yaml/yaml_transform.py:529: in expand_composite_transform
    return CompositePTransform.expand(None)
target/.tox-py39/py39/lib/python3.9/site-packages/apache_beam/yaml/yaml_transform.py:520: in expand
    inner_scope.compute_all()
target/.tox-py39/py39/lib/python3.9/site-packages/apache_beam/yaml/yaml_transform.py:247: in compute_all
    self.compute_outputs(transform_id)
target/.tox-py39/py39/lib/python3.9/site-packages/apache_beam/yaml/yaml_transform.py:96: in wrapper
    self._cache[key] = func(self, *args)
target/.tox-py39/py39/lib/python3.9/site-packages/apache_beam/yaml/yaml_transform.py:283: in compute_outputs
    return expand_transform(self._transforms_by_uuid[transform_id], self)
target/.tox-py39/py39/lib/python3.9/site-packages/apache_beam/yaml/yaml_transform.py:457: in expand_transform
    return expand_leaf_transform(spec, scope)
target/.tox-py39/py39/lib/python3.9/site-packages/apache_beam/yaml/yaml_transform.py:479: in expand_leaf_transform
    ptransform = scope.create_ptransform(spec, inputs_dict.values())
target/.tox-py39/py39/lib/python3.9/site-packages/apache_beam/yaml/yaml_transform.py:373: in create_ptransform
    provider = self.best_provider(spec, input_providers)
                else:
                    env_list = None  # Use execv instead of execve.
                executable = os.fsencode(executable)
                if os.path.dirname(executable):
                    executable_list = (executable,)
                else:
                    # This matches the behavior of os._execvpe().
                    executable_list = tuple(
                        os.path.join(os.fsencode(dir), executable)
                        for dir in os.get_exec_path(env))
                fds_to_keep = set(pass_fds)
                fds_to_keep.add(errpipe_write)
                self.pid = _posixsubprocess.fork_exec(
                        args, executable_list,
                        close_fds, tuple(sorted(map(int, fds_to_keep))),
                        cwd, env_list,
                        p2cread, p2cwrite, c2pread, c2pwrite,
                        errread, errwrite,
                        errpipe_read, errpipe_write,
                        restore_signals, start_new_session,
                        gid, gids, uid, umask,
                        preexec_fn)
                self._child_created = True
            finally:
                # be sure the FD is closed no matter what
                os.close(errpipe_write)
    
            self._close_pipe_fds(p2cread, p2cwrite,
                                 c2pread, c2pwrite,
                                 errread, errwrite)
    
            # Wait for exec to fail or succeed; possibly raising an
            # exception (limited in size)
            errpipe_data = bytearray()
            while True:
>               part = os.read(errpipe_read, 50000)
E               Failed: Timeout >600.0s

/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/subprocess.py:1793: Failed
----------------------------- Captured stdout call -----------------------------

Taken from https://github.com/apache/beam/actions/runs/8548288262/job/23421776246?pr=30843
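
A possible test-side workaround (a sketch only; it assumes the java lookup goes through subprocess.run as shown in the stack trace, and the /usr/bin/java path in the fake result is made up) is to stub out the `which java` call so the GHA runner never has to fork a child process for it:

```python
import subprocess
from unittest import mock

_real_run = subprocess.run


def _fake_run(cmd, *args, **kwargs):
    # Short-circuit the flaky `which java` lookup; delegate everything else.
    if isinstance(cmd, (list, tuple)) and list(cmd[:2]) == ['which', 'java']:
        # Hypothetical path; callers only need a zero return code.
        return subprocess.CompletedProcess(
            cmd, returncode=0, stdout=b'/usr/bin/java\n', stderr=b'')
    return _real_run(cmd, *args, **kwargs)


# Usage sketch: pretend java is present without forking a child process.
with mock.patch('subprocess.run', side_effect=_fake_run):
    pass  # run the YAML example test body here
```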

Issue Failure

Failure: Test is flaky

Issue Priority

Priority: 2 (backlog / disabled test but we think the product is healthy)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
