力扣题解30. 串联所有单词的子串

Python&Java双语解决力扣必刷算法，题号30. 串联所有单词的子串

题目描述

解题思路

完整代码

Python

Java

题目描述

给定一个字符串 s 和一个字符串数组 words。 words 中所有字符串 长度相同。

s 中的 串联子串 是指一个包含 words 中所有字符串以任意顺序排列连接起来的子串。

例如，如果 words = ["ab","cd","ef"]，那么 "abcdef"， "abefcd"，"cdabef"， "cdefab"，"efabcd"，和 "efcdab" 都是串联子串。 "acdbef" 不是串联子串，因为他不是任何 words 排列的连接。

返回所有串联子串在 s 中的开始索引。你可以以 任意顺序 返回答案。

示例 1：

输入：s = "barfoothefoobarman", words = ["foo","bar"]
输出：[0,9]
解释：因为 words.length == 2 同时 words[i].length == 3，连接的子字符串的长度必须为 6。
子串 "barfoo" 开始位置是 0。它是 words 中以 ["bar","foo"] 顺序排列的连接。
子串 "foobar" 开始位置是 9。它是 words 中以 ["foo","bar"] 顺序排列的连接。
输出顺序无关紧要。返回 [9,0] 也是可以的。

示例 2：

输入：s = "wordgoodgoodgoodbestword", words = ["word","good","best","word"]
输出：[]
解释：因为 words.length == 4 并且 words[i].length == 4，所以串联子串的长度必须为 16。
s 中没有子串长度为 16 并且等于 words 的任何顺序排列的连接。
所以我们返回一个空数组。

示例 3：

输入：s = "barfoofoobarthefoobarman", words = ["bar","foo","the"]
输出：[6,9,12]
解释：因为 words.length == 3 并且 words[i].length == 3，所以串联子串的长度必须为 9。
子串 "foobarthe" 开始位置是 6。它是 words 中以 ["foo","bar","the"] 顺序排列的连接。
子串 "barthefoo" 开始位置是 9。它是 words 中以 ["bar","the","foo"] 顺序排列的连接。
子串 "thefoobar" 开始位置是 12。它是 words 中以 ["the","foo","bar"] 顺序排列的连接。

提示：

1 <= s.length <= 104
1 <= words.length <= 5000
1 <= words[i].length <= 30
words[i] 和 s 由小写英文字母组成

解题思路

要解决这个问题，我们可以使用一个滑动窗口的方法。核心思路是维护一个窗口，这个窗口的大小是所有 words 中字符串长度总和。我们逐个移动这个窗口，然后检查窗口内的子串是否是 words 中所有字符串的某种排列。

具体步骤如下：

首先，计算 words 中所有字符串的总长度，这将是我们滑动窗口的大小。
使用一个哈希表 wordCount 来统计 words 数组中每个单词出现的次数。
遍历字符串 s，对于每个可能的窗口起始位置，使用另一个哈希表 seen 来记录当前窗口中出现的单词及其频率。
检查当前窗口内的单词是否与 words 中的单词完全匹配（即 seen 与 wordCount 相等），如果匹配，则将当前窗口的起始索引添加到结果列表中。
最后，返回结果列表。

完整代码

Python

class Solution:
    def findSubstring(self, s: str, words: List[str]) -> List[int]:
        if not words or not s:
            return []

        word_len = len(words[0])
        total_len = word_len * len(words)
        word_count = {}
        result = []

        # 统计 words 中每个单词出现的次数
        for word in words:
            if word in word_count:
                word_count[word] += 1
            else:
                word_count[word] = 1

        # 遍历每个可能的窗口起始位置
        for i in range(word_len):
            left = i
            right = i
            seen = {}

            while right + word_len <= len(s):
                word = s[right:right + word_len]
                right += word_len

                if word in word_count:
                    if word in seen:
                        seen[word] += 1
                    else:
                        seen[word] = 1

                    # 如果当前单词的出现次数超过了它在 words 中的次数，移动左指针
                    while seen[word] > word_count[word]:
                        seen[s[left:left + word_len]] -= 1
                        left += word_len

                    # 如果窗口大小等于总长度，记录起始位置
                    if right - left == total_len:
                        result.append(left)
                else:
                    # 当前单词不在 words 中，重置 seen，移动左指针到 right
                    seen.clear()
                    left = right

        return result

这段代码定义了一个 Solution 类，该类包含一个 findSubstring 方法。这个方法接受一个字符串 s 和一个字符串列表 words 作为输入，返回一个整数列表，表示所有符合条件的起始索引。代码的逻辑与之前讨论的相同，首先计算必要的变量，如每个单词的长度、总长度和单词计数哈希表。然后，遍历字符串 s 的每个可能的窗口起始位置，使用另一个哈希表 seen 来跟踪当前窗口中的单词及其出现次数，并最终确定符合条件的起始索引。

通过

Java

class Solution {
    public List<Integer> findSubstring(String s, String[] words) {
        List<Integer> result = new ArrayList<>();
        if (words == null || words.length == 0 || s == null || s.length() == 0) {
            return result;
        }

        int wordLen = words[0].length();
        int totalLen = wordLen * words.length;
        Map<String, Integer> wordCount = new HashMap<>();

        // 统计words中每个单词出现的次数
        for (String word : words) {
            wordCount.put(word, wordCount.getOrDefault(word, 0) + 1);
        }

        // 遍历每个可能的窗口起始位置
        for (int i = 0; i < wordLen; i++) {
            int left = i, right = i;
            Map<String, Integer> seen = new HashMap<>();

            while (right + wordLen <= s.length()) {
                String word = s.substring(right, right + wordLen);
                right += wordLen;

                if (wordCount.containsKey(word)) {
                    seen.put(word, seen.getOrDefault(word, 0) + 1);

                    // 如果当前单词的出现次数超过了它在words中的次数，移动左指针
                    while (seen.get(word) > wordCount.get(word)) {
                        String leftWord = s.substring(left, left + wordLen);
                        seen.put(leftWord, seen.get(leftWord) - 1);
                        left += wordLen;
                    }

                    // 如果窗口大小等于总长度，记录起始位置
                    if (right - left == totalLen) {
                        result.add(left);
                    }
                } else {
                    // 当前单词不在words中，重置seen，移动左指针到right
                    seen.clear();
                    left = right;
                }
            }
        }

        return result;
    }
}

这个Java实现遵循了之前讨论的逻辑，使用了HashMap来存储单词的计数，以及一个ArrayList来存储结果。遍历字符串s的每个可能的起始位置，逐个检查窗口内的子串是否满足条件，最终返回所有符合条件的起始索引。在Java中处理字符串和集合的方式与Python有所不同，比如使用substring方法来获取子字符串，以及使用Map的getOrDefault方法来简化计数逻辑。

通过