I need to check a string to see if any word in it has multiple occurences. So basically I will accept:
"google makes love"
but I don't accept:
"google makes google love" or "google makes love love google" etc.
Any ideas? Really don't know any way to approach this, any help would be greatly appreciated.
Answers:
I need to check a string to see if any word in it has multiple occurences. So basically I will accept:
"google makes love"
but I don't accept:
"google makes google love" or "google makes love love google" etc.
Any ideas? Really don't know any way to approach this, any help would be greatly appreciated.
Answers:
Based on Wicked Flea code:
function single_use_of_words($str) {
$words = explode(' ', trim($str)); //Trim to prevent any extra blank
if (count(array_unique($words)) == count($words)) {
return true; //Same amount of words
}
return false;
}
Answers:
Try this:
function single_use_of_words($str) {
$words = explode(' ', $str);
$words = array_unique($words);
return implode(' ', $words);
}
Answers:
No need for loops or arrays:
<?php
$needle = 'cat';
$haystack = 'cat in the cat hat';
if ( occursMoreThanOnce($haystack, $needle) ) {
echo 'Success';
}
function occursMoreThanOnce($haystack, $needle) {
return strpos($haystack, $needle) !== strrpos($haystack, $needle);
}
?>
Answers:
<?php
$words = preg_split('\b', $string, PREG_SPLIT_NO_EMPTY);
$wordsUnique = array_unique($words);
if (count($words) != count($wordsUnique)) {
echo 'Duplicate word found!';
}
?>
Answers:
The regular expression way would definitely be my choice.
I did a little test on a string of 320 words with Veynom's function and a regular expression
function preg( $txt ) {
return !preg_match( '/\b(\w+)\b.*?/', $txt );
}
Here's the test
$time['preg'] = microtime( true );
for( $i = 0; $i < 1000; $i++ ) {
preg( $txt );
}
$time['preg'] = microtime( true ) - $time['preg'];
$time['veynom-thewickedflea'] = microtime( true );
for( $i = 0; $i < 1000; $i++ ) {
single_use_of_words( $txt );
}
$time['veynom-thewickedflea'] = microtime( true ) - $time['veynom-thewickedflea'];
print_r( $time );
And here's the result I got
Array
(
[preg] => 0.197616815567
[veynom-thewickedflea] => 0.487532138824
)
Which suggests that the RegExp solution, as well as being a lot more concise is more than twice as fast. ( for a string of 320 words anr 1000 iterations )
When I run the test over 10 000 iterations I get
Array
(
[preg] => 1.51235699654
[veynom-thewickedflea] => 4.99487900734
)
The non RegExp solution also uses a lot more memory.
So.. Regular Expressions for me cos they've got a full tank of gas
EDIT
The text I tested against has duplicate words, If it doesn't, the results may be different. I'll post another set of results.
Update
With the duplicates stripped out ( now 186 words ) the results for 1000 iterations is:
Array
(
[preg] => 0.235826015472
[veynom-thewickedflea] => 0.2528860569
)
About evens
Answers:
function Accept($str)
{
$words = explode(" ", trim($str));
$len = count($words);
for ($i = 0; $i < $len; $i++)
{
for ($p = 0; $p < $len; $p++)
{
if ($p != $i && $words[$i] == $words[$p])
{
return false;
}
}
}
return true;
}
EDIT
Entire test script. Note, when printing "false" php just prints nothing but true is printed as "1".
<?php
function Accept($str)
{
$words = explode(" ", trim($str));
$len = count($words);
for ($i = 0; $i < $len; $i++)
{
for ($p = 0; $p < $len; $p++)
{
if ($p != $i && $words[$i] == $words[$p])
{
return false;
}
}
}
return true;
}
echo Accept("google makes love"), ", ", Accept("google makes google love"), ", ",
Accept("google makes love love google"), ", ", Accept("babe health insurance babe");
?>
Prints the correct output:
1, , ,
Answers:
This seems fairly fast. It would be interesting to see (for all the answers) how the memory usage and time taken increase as you increase the length of the input string.
function check($str) {
//remove double spaces
$c = 1;
while ($c) $str = str_replace(' ', ' ', $str, $c);
//split into array of words
$words = explode(' ', $str);
foreach ($words as $key => $word) {
//remove current word from array
unset($words[$key]);
//if it still exists in the array it must be duplicated
if (in_array($word, $words)) {
return false;
}
}
return true;
}
Edit
Fixed issue with multiple spaces. I'm not sure whether it is better to remove these at the start (as I have) or check each word is non-empty in the foreach.
Answers:
The simplest method is to loop through each word and check against all previous words for duplicates.
Answers:
Regular expression with backreferencing