Find Tokenized Words in a String

Leonard Yeo
Level Up Coding
Published in
2 min readAug 16, 2021

--

Find Tokenized Words in a String

Problem

Find tokenized words in a given string, S.

What are Tokenized Words?

  1. Words without space. For example, abc, def, a, b, c
  2. Quoted words with or without space. For example, “qwerty asd zxcv”, “abc”, “a”

Example 01

  • Input: asd def qwe “qwerty asd zxcv” asf “tyuip dfhgdj fgh”
  • Output: [‘asd’, ‘def’, ‘qwe’, ‘“qwerty asd zxcv”’, ‘asf’, ‘“tyuip dfhgdj fgh”’]

Example 02

  • Input: “qwerty asd zxcv”
  • Output: [‘“qwerty asd zxcv”’]

Example 03

  • Input: asd def qwe
  • Output: [‘asd’, ‘def’, ‘qwe’]

Solution

Here’s to my solution to this problem. The entire solution is inspired by the two pointer technique.

Visualization

Finding word with no quotes

The whole idea is to ensure your right pointer lands on a space. That’s would give a good indication that it satisfies as a valid tokenized word.

Finding word with quotes

Same with quoted tokenized word, ensure your left pointer lands on a quote, making sure this is the start quote and moving your right pointer lands to another quote. That’s would give a good indication that it satisfies as a valid quoted tokenized word.

Time and Space complexity

The time complexity for this solution is O(N*N-1) where N is the number of characters in the string.

The space complexity for this solution if we were to include in returning the array would be O(M) where M is the number of valid tokenized words. If we do not include it in the returned array, then the space complexity would be O(1) as we do not use additional auxiliary memory space.

Takeaways

I hope this blog post would be helpful to someone out there who may be struggling to find the solution.

To be honest, this took me a while to find this solution. If anyone has a more optimal solution, do give some feedback so that I am aware of as well. Thank you! Peace! ✌️

--

--