Monday, June 28, 2010

mysql - find grandparents/grandchildren

Another interesting set of queries are parent, grandparent queries.

To find all the grand children nodes do this.
select b.id, b.parent_id, a.parent_id from kids as a, kids as b where a.id = b.parent_id and a.parent_id is not null;

To find all the grand parents with more than 2 grand children do this.
select b.id, b.parent_id, a.parent_id from kids as a, kids as b where a.id = b.parent_id and a.parent_id is not null group by a.parent_id having count(a.parent_id) > 2;

To produce the full family tree of a grandchild node do this
select b.id, b.parent_id, a.parent_id from kids as a right outer join kids as b on a.id = b.parent_id;

To find all nodes whos grandparent(and possibly parent) is unknown do this;
select b.id, b.parent_id, a.parent_id from kids as a right outer join kids as b on a.id = b.parent_id where a.parent_id is null;

Here is my full test sql.. try it out
create table kids (id integer, parent_id integer);
describe kids;
insert into kids values (1,null);
insert into kids values (3,1);
insert into kids values (4,2);
insert into kids values (5,2);
insert into kids values (6,4);
insert into kids values (7,2);
insert into kids values (8,7);
insert into kids values (9,7);
insert into kids values (10,7);
insert into kids values (11,3);
select * from kids;
select b.id, b.parent_id, a.parent_id from kids as a, kids as b where a.id = b.parent_id and a.parent_id is not null;

select b.id, b.parent_id, a.parent_id from kids as a, kids as b where a.id = b.parent_id and a.parent_id is not null group by a.parent_id having count(a.parent_id) > 2;

select b.id, b.parent_id, a.parent_id from kids as a right outer join kids as b on a.id = b.parent_id;

select b.id, b.parent_id, a.parent_id from kids as a right outer join kids as b on a.id = b.parent_id where a.parent_id is null;

Saturday, June 26, 2010

My sql -- finding duplicates in a column

Here is how to find all entries that have duplicates in a mysql database


select email, count(email) from emails group by email having count(email) > 1;

So what is happening. To me it appears that the SQL boys have decided that selects can act in two independent ways:
1) "normal mode" - give me all the results
2) "aggregate mode" - give me groups of results represented by the first group member

It appears to me that this "aggregate mode" is triggered if either the "group by" or an aggregate function is present in the statement.

To be more specific;
1) When the "GROUP BY" is not present but an aggregate function is used then it is assumed that the data is one large group and only a single row is returned. This row appears to have the data from the first row of the normal select and the results of any aggregate function applied access the entire normal select results.
2) However if the "GROUP BY" is present then the functions operate on the individual groups. The resulting row(s) will use the normal selects first row for the normal data in the group and will apply any functions across the entire group.

This statement here will demonstrate the second rule in action more clearly.
select email, count(email), num, sum(num), avg(num), min(num), max(num) from emails group by email;

Here is the full test sql which brought me to my conclusions;
create table emails (email text, num integer);
describe emails;
insert into emails values("a@a.com", 1);
insert into emails values("b@a.com", 2);
insert into emails values("b@a.com", 3);
insert into emails values("c@a.com", 4);
insert into emails values("d@a.com", 5);
insert into emails values("d@a.com", 6);
insert into emails values("d@a.com", 7);
insert into emails values("b@a.com", 8);
select * from emails;
select distinct email from emails;
select email from emails group by email;
select email, count(email) from emails;
select email, count(email), num, sum(num), avg(num), min(num), max(num) from emails group by email;
select email, count(email) from emails group by email having count(email) > 1;

There are several functions that operate on the "group"
sum(), count(), avg(), min(), max(), etc

Refer here for more info;
http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html

Mysql grant syntax

As unbelievable as it is I forget the grant syntax way to easily...

GRANT ALL ON database.* TO username@'localhost' IDENTIFIED BY 'password';

die hard 3 jug 5 jug problem

Question:
You have two jugs a 3 and 5 liter jug how do you get 4 liters in a jug.

Answer:
This is one of the more basic questions, but it still appears in job interviews. It is basically impossible not to get the answer correct.

1) First fill the 5 liter jug.
2) Poor water from the 5 liter into the 3 liter one until its full. That leaves 2 liters in the 5 liter,
3) Empty the 3 liter and transfer the 2 liters into it. This leaves 1 liter of space in the 3 liter jug.
4) Then fill the 5 liter jug
5) Then poor 1 liter out of the 5 liter into the 1 liter of empty space in the 3 liter jug.

Done the 5 liter jug has 4 liters in it

c++: Basic Traits

Traits:
In C++ "traits" are basically a group of template classes that provide miscellaneous information about another type or data structure.

'Think of a trait as a small object whose main purpose is to carry information used by another object or algorithm to determine "policy" or "implementation details".'
- Bjarne Stroustrup

Traits are very common in C++. One of the more common ones is the string class its self. The string class uses traits to provide information about internationalization and character encoding. Often they are visible in the code and debugger output as a template parameter that has a default initialization via a second template.

In other cases the use of traits is more normal as in the the class "std::numeric_limits" and the other members of the limits.h header.

Boost also offers many interesting and helpful traits classes as an example here is the "is_void" traits class.

template< typename T > 
struct is_void{ 
  static const bool value = false;
};

template<> 
struct is_void< void >{ 
  static const bool value = true; 
};


As you can see this trait template's sole purpose to is provide information about what the input parameter type was. This is only really useful in a parts of the code which the programmer doesn't know what the input type was: Hence inside another template.

For more info refer to:
http://www.cantrip.org/traits.html
http://www.cplusplus.com/reference/std/limits/numeric_limits/

The truth about what movitates people

http://www.youtube.com/watch?v=u6XAPnuFjJc

Wednesday, June 23, 2010

Three coworkers sharing average salaries

Question:
Three coworkers would like to know their average salary. How can they do it, without disclosing their own salaries?

The key is to scramble the sum with unknown/unshared data. The scramble must be reversible and when reversing it must not effect the sum.

Answer on the web:
So each person adds a random number(Rn) plus there salary(Sn) to the sum(Sum) and hands it to the next one. And then in the second round they each deduce there random number

(round 1)
Sum0 = S1 + R1
Sum1 = S1 + S2 + R1 + R2
Sum2 = S1 + S2 + S3 + R1 + R2 + R3

(round 2)
Sum3 = S1 + S2 + S3 + R2 + R3
Sum4 = S1 + S2 + S3 + R3
Sum5 = S1 + S2 + S3

And final they divide by the number of people
Average(6) = (S1 + S2 + S3)/3

Here is the proof:
For person 1
Starts with:
R1
S1
Sum0 = S1 + R1
Sum2 = S1 + S2 + S3 + R1 + R2 + R3
Sum3 = S1 + S2 + S3 + R2 + R3
Sum5 = S1 + S2 + S3

Can calc
R2 + R3 = Sum3 - Sum5

So he finally knows
R1
S1
R2 + R3
S2 + S3 

For person 2
He starts with:
R2
S2
Sum0 = S1 + R1
Sum1 = S1 + S2 + R1 + R2
Sum3 = S1 + S2 + S3 + R2 + R3
Sum4 = S1 + S2 + S3 + R3
Sum5 = S1 + S2 + S3

He can calc:
R3 = Sum4 - Sum5

He finally knows knows
R2
R3
S2
S1 + R1
S1 + S3

For person 3
He Starts with:
R3
S3
Sum1 = S1 + S2 + R1 + R2
Sum2 = S1 + S2 + S3 + R1 + R2 + R3
Sum4 = S1 + S2 + S3 + R3
Sum5 = S1 + S2 + S3

He can calc:
S1 + S2 = Sum5 - S2
R1 + R2 = Sum2 - S1 + S2 

He finally knows:
R3
S3
S1 + S2
R1 + R2

Here is a much stronger answer
It requires an 2 or more arbitrators in the system, At the end of each Sum stage the summer chooses a random arbitrator who then adds his own random value to the sum.
Once Sum5 stage is complete the arbitrators each in turn recieve the sum and deduct out the total of the random numbers they added in. As a result the computation becomes

(round 1)
Sum0 = S1 + R1 + A0
Sum1 = S1 + S2 + R1 + R2 + A0 + A1
Sum2 = S1 + S2 + S3 + R1 + R2 + R3 + A0 + A1 + A2

(round 2)
Sum3 = S1 + S2 + S3 + R2 + R3 + A0 + A1 + A2 + A3
Sum4 = S1 + S2 + S3 + R3 + A0 + A1 + A2 + A3 + A4
Sum5 = S1 + S2 + S3 + A0 + A1 + A2 + A3 + A4 + A5

Sum6 =  S1 + S2 + S3 + A0 + A2 + A4
Sum7 =  S1 + S2 + S3

Hence its much more harder to break

Nuggets in packs of 6, 9 and 20.

Question:
At a fast food restaurant you can buy chicken nuggets in packs of 6, 9 or 20. Is there such a number N, that for all numbers greater than it, it is possible to buy any number of chicken nuggets?

Answer:
Keep in mind the key fact we need to prove that numbers are not possible. The equation we have is 6x + 9y + 20z = n. The minimum step we can take is 6 so lets work from there.
If we assume y and z are 0 then the equation becomes 6x = n. Hence the possible numbers become

 1  2  3  4  5  x
 7  8  9 10 11  x
13 14 15 16 17  x 
19 20 21 22 23  x 

If we assume y=1, z=0 then the equation becomes 9 + 6x = n. Hence the possible numbers become

 1  2  3  4  5  x
 7  8  x 10 11  x
13 14  x 16 17  x 
19 20  x 22 23  x 

Any further 9s will take us back to the same used columns so more 9s is pointless that leaves 20. If we assume y=0, z=1 then the equation becomes 20 + 6x = n

 1  2  3  4  5  x
 7  8  x 10 11  x
13 14  x 16 17  x
19  x  x 22 23  x
25  x  x 28 29  x
31  x  x 34 35  x
37  x  x 40 41  x
43  x  x 46 47  x

If we assume y=1, z=1 then the equation becomes 20 + 9 + 6x = 29 + 6n = n

 1  2  3  4  5  x
 7  8  x 10 11  x
13 14  x 16 17  x
19  x  x 22 23  x
25  x  x 28  x  x
31  x  x 34  x  x
37  x  x 40  x  x
43  x  x 46  x  x

If we go y=2,z=1 now then it becomes 38 + 6n which is already taken. So we try another 20 If we assume y=0, z=2 then the equation becomes 40 + 6x = n

 1  2  3  4  5  x
 7  8  x 10 11  x
13 14  x 16 17  x
19  x  x 22 23  x
25  x  x 28  x  x
31  x  x 34  x  x
37  x  x  x  x  x
43  x  x  x  x  x
 x  x  x  x  x  x

If we assume y=1, z=2 then the equation becomes 40 + 9 + 6x = 49 + 6x = n

Hence N = 43.

fork: basic child processes - creating, waiting for and ending children

Here is basic example program demonstrating the use of forks and child processes.

The key points are;
  • Children processes are created with a call to fork, Fork returns:
    • A number less than 0(the Error code) on an error.
    • A number equal to 0 in the thread.
    • A number greater than to 0(the childs pid) in the main process.
  • A call to getpid() will return the pid of the child or parent process.

  • Keep in might the differences between a thread and process. The fundamental difference is that a thread shares all regions of memory except the stack where as a process shares none. In general memory is copied over the the child process as needed by the kernel.

//complie with; g++ fork.cpp
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>
#include <time.h>

struct BaseA
{
 pid_t pid;
 int number;
};

//The thread core...
void child_worker(BaseA* data)
{
 data->pid = getpid();

 for(int i = 0; i < 10; i++)
   {
     printf("Child %d @ %d \n", data->pid, data->number++);
     sleep(rand()%3000/1000);
   }

 printf("Child %d DONE!\n", data->pid);
 exit(0);
}

int main()
{
 srand(time(NULL));

 pid_t childPid;
 BaseA data;

 //setup the process
 data.number = 1;

 // create the threads
 childPid = fork();
 if (childPid >= 0) // fork ok
    if (childPid == 0) // Am I the kid
       child_worker(&data);
 childPid = fork();
 if (childPid >= 0) // fork ok
    if (childPid == 0) // Am I the kid
       child_worker(&data);

 //wait for all kids to complete
 wait(NULL);
 wait(NULL);

 printf("Parent %d DONE!\n", data.number);

 exit(0);
}

Tuesday, June 22, 2010

pthread: basic threading - creating, waiting for and ending threads

Here is basic example program demonstrating threads using the posfix threads library.

The key points are;
  • Threads are created with a call to pthread_create, the parameters are:
    • The threads handle, consider it to be the equivalent of a file handle.
    • The threads attributes. Normally the defaults are sufficient so use NULL, otherwise you might end up having to allocate the stack for it too.
    • The function pointer to the threads "main", if this routine exits or returns it will implicitly end the thread just like the pthread_exit call.
    • And a generic chunk of memory that is passed to the threads "main".
  • pthread_join; this takes the threads handle and simply blocks until it gets the signal that the thread has ended. The second parameter is a pointer to the location of where to store the pointer for the threads exit.. gezz look at the function definition its more easy to understand than a written description of it.
  • pthread_exit: This is the thread equivalent of exit but it can return a pointer to a complex piece of data.

And here is the sample code its really simple and just a brush up on threads for now
//complie with; g++ thread.cpp -lpthread
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <pthread.h>

struct BaseA
{
 pthread_t handler; //the handle to the thread.. a little dangerous!
 int number;
};

//The thread core...
void* thread_worker( void *ptr )
{
 BaseA *data = (BaseA*)ptr;

 for(int i = 0; i < 10; i++)
   {
     printf("Thread %d @ %d \n", data->number, i);
     sleep(rand()%3000/1000);
   }

 printf("Thread %d DONE!\n", data->number);
 pthread_exit(NULL);
}

int main()
{
 srand(time(NULL));

 BaseA data[2];

 //setup the threads
 data[0].number = 1;
 data[1].number = 2;

 // create the threads
 pthread_create (&data[0].handler, NULL, &thread_worker, (void *) &data[0]);
 pthread_create (&data[1].handler, NULL, &thread_worker, (void *) &data[1]);

 //wait for all kids to complete
 pthread_join(data[0].handler, NULL);
 pthread_join(data[1].handler, NULL);

 exit(0);
}

strace - Debugging a live misbehaving process.

Often when a server is live you dont have the luxury of installing a debug-able version and stepping through its code when the problem appears. Other times you need to get a hold on whatever is killing a program that you never even wrote.

Enter "strace" this tool watches the system calls of a process and dumps it to your console for analysis. strace is not for the faint at heart. You need to know about how shells operate, how dynamic linked libraries are loaded and based and how kernel system calls pass in and out of system. On the flip side if you have never seen it before its a real eye opener. The average programmer most likely just doesn't realize how involved the kernel is with your little program.

To execute it on a program from scratch
strace -v <filename>
Note the "-v" gives you information about the environment variables that are passed from the shell to the new process.

To attach to a live process (you'll need to match the processes privilege level or better)
strace -p <process_id>

Here are some more resources on it:
http://www.redhat.com/magazine/010aug05/features/strace/
http://linux.die.net/man/1/strace

ok and here is an strace of ls in an empty directory so you can clearly see that simple isnt really that simple! (PS I seem to have a problem with my locales is polluting the result)
~/tmp> strace ls ./

execve("/bin/ls", ["ls", "./"], [/* 42 vars */]) = 0
brk(0)                                  = 0x61a000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7da0f51000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7da0f4f000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/home/ashley/obj-mail-release-NDEBUG/dist/bin/tls/x86_64/librt.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat("/home/ashley/obj-mail-release-NDEBUG/dist/bin/tls/x86_64", 0x7fff25ef42d0) = -1 ENOENT (No such file or directory)
open("/home/ashley/obj-mail-release-NDEBUG/dist/bin/tls/librt.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat("/home/ashley/obj-mail-release-NDEBUG/dist/bin/tls", 0x7fff25ef42d0) = -1 ENOENT (No such file or directory)
open("/home/ashley/obj-mail-release-NDEBUG/dist/bin/x86_64/librt.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat("/home/ashley/obj-mail-release-NDEBUG/dist/bin/x86_64", 0x7fff25ef42d0) = -1 ENOENT (No such file or directory)
open("/home/ashley/obj-mail-release-NDEBUG/dist/bin/librt.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat("/home/ashley/obj-mail-release-NDEBUG/dist/bin", 0x7fff25ef42d0) = -1 ENOENT (No such file or directory)
open("/home/ashley/obj-mail-debug/dist/lib/tls/x86_64/librt.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat("/home/ashley/obj-mail-debug/dist/lib/tls/x86_64", 0x7fff25ef42d0) = -1 ENOENT (No such file or directory)
open("/home/ashley/obj-mail-debug/dist/lib/tls/librt.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat("/home/ashley/obj-mail-debug/dist/lib/tls", 0x7fff25ef42d0) = -1 ENOENT (No such file or directory)
open("/home/ashley/obj-mail-debug/dist/lib/x86_64/librt.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat("/home/ashley/obj-mail-debug/dist/lib/x86_64", 0x7fff25ef42d0) = -1 ENOENT (No such file or directory)
open("/home/ashley/obj-mail-debug/dist/lib/librt.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
stat("/home/ashley/obj-mail-debug/dist/lib", 0x7fff25ef42d0) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=102124, ...}) = 0
mmap(NULL, 102124, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7da0f36000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/librt.so.1", O_RDONLY)       = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240#\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=35784, ...}) = 0
mmap(NULL, 2132968, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f7da0b2c000
mprotect(0x7f7da0b34000, 2093056, PROT_NONE) = 0
mmap(0x7f7da0d33000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x7000) = 0x7f7da0d33000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/libselinux.so.1", O_RDONLY)  = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240Q\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=109368, ...}) = 0
mmap(NULL, 2209176, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f7da0910000
mprotect(0x7f7da0929000, 2097152, PROT_NONE) = 0
mmap(0x7f7da0b29000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x19000) = 0x7f7da0b29000
mmap(0x7f7da0b2b000, 1432, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f7da0b2b000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/libacl.so.1", O_RDONLY)      = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\220\33\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=27600, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7da0f35000
mmap(NULL, 2122744, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f7da0709000
mprotect(0x7f7da070f000, 2097152, PROT_NONE) = 0
mmap(0x7f7da090f000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x6000) = 0x7f7da090f000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/libc.so.6", O_RDONLY)        = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\340\342"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1436976, ...}) = 0
mmap(NULL, 3543672, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f7da03a7000
mprotect(0x7f7da04ff000, 2097152, PROT_NONE) = 0
mmap(0x7f7da06ff000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x158000) = 0x7f7da06ff000
mmap(0x7f7da0704000, 17016, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f7da0704000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/libpthread.so.0", O_RDONLY)  = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260W\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=130224, ...}) = 0
mmap(NULL, 2208624, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f7da018b000
mprotect(0x7f7da01a1000, 2097152, PROT_NONE) = 0
mmap(0x7f7da03a1000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16000) = 0x7f7da03a1000
mmap(0x7f7da03a3000, 13168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f7da03a3000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/libdl.so.2", O_RDONLY)       = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0 \16\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=14624, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7da0f34000
mmap(NULL, 2109728, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f7d9ff87000
mprotect(0x7f7d9ff89000, 2097152, PROT_NONE) = 0
mmap(0x7f7da0189000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x7f7da0189000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/libattr.so.1", O_RDONLY)     = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0000\21\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=16128, ...}) = 0
mmap(NULL, 2111240, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f7d9fd83000
mprotect(0x7f7d9fd87000, 2093056, PROT_NONE) = 0
mmap(0x7f7d9ff86000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x3000) = 0x7f7d9ff86000
close(3)                                = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7da0f33000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7da0f32000
arch_prctl(ARCH_SET_FS, 0x7f7da0f32780) = 0
mprotect(0x7f7da06ff000, 12288, PROT_READ) = 0
munmap(0x7f7da0f36000, 102124)          = 0
set_tid_address(0x7f7da0f32810)         = 15691
set_robust_list(0x7f7da0f32820, 0x18)   = 0
futex(0x7fff25ef4e0c, 0x81 /* FUTEX_??? */, 1) = 0
rt_sigaction(SIGRTMIN, {0x7f7da01902d0, [], SA_RESTORER|SA_SIGINFO, 0x7f7da01997d0}, NULL, 8) = 0
rt_sigaction(SIGRT_1, {0x7f7da0190350, [], SA_RESTORER|SA_RESTART|SA_SIGINFO, 0x7f7da01997d0}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
brk(0)                                  = 0x61a000
brk(0x63b000)                           = 0x63b000
open("/etc/selinux/config", O_RDONLY)   = -1 ENOENT (No such file or directory)
statfs("/selinux", 0x7fff25ef3d30)      = -1 ENOENT (No such file or directory)
open("/proc/mounts", O_RDONLY)          = 3
fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7da0f4e000
read(3, "rootfs / rootfs rw 0 0\nnone /sys"..., 1024) = 1024
read(3, "server1/public /home/ashley/file"..., 1024) = 537
read(3, "", 1024)                       = 0
close(3)                                = 0
munmap(0x7f7da0f4e000, 4096)            = 0
open("/usr/lib/locale/locale-archive", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/locale.alias", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=2586, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7da0f4e000
read(3, "# Locale name alias data base.\n#"..., 4096) = 2586
read(3, "", 4096)                       = 0
close(3)                                = 0
munmap(0x7f7da0f4e000, 4096)            = 0
open("/usr/lib/locale/en_US.UTF-8/LC_IDENTIFICATION", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/en_US.utf8/LC_IDENTIFICATION", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=373, ...}) = 0
mmap(NULL, 373, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7da0f4e000
close(3)                                = 0
open("/usr/lib/gconv/gconv-modules.cache", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=25700, ...}) = 0
mmap(NULL, 25700, PROT_READ, MAP_SHARED, 3, 0) = 0x7f7da0f47000
close(3)                                = 0
futex(0x7f7da0703f40, 0x81 /* FUTEX_??? */, 2147483647) = 0
open("/usr/lib/locale/en_US.UTF-8/LC_MEASUREMENT", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/en_US.utf8/LC_MEASUREMENT", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=23, ...}) = 0
mmap(NULL, 23, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7da0f46000
close(3)                                = 0
open("/usr/lib/locale/en_US.UTF-8/LC_TELEPHONE", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/en_US.utf8/LC_TELEPHONE", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=59, ...}) = 0
mmap(NULL, 59, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7da0f45000
close(3)                                = 0
open("/usr/lib/locale/en_US.UTF-8/LC_ADDRESS", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/en_US.utf8/LC_ADDRESS", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=155, ...}) = 0
mmap(NULL, 155, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7da0f44000
close(3)                                = 0
open("/usr/lib/locale/en_US.UTF-8/LC_NAME", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/en_US.utf8/LC_NAME", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=77, ...}) = 0
mmap(NULL, 77, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7da0f43000
close(3)                                = 0
open("/usr/lib/locale/en_US.UTF-8/LC_PAPER", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/en_US.utf8/LC_PAPER", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=34, ...}) = 0
mmap(NULL, 34, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7da0f42000
close(3)                                = 0
open("/usr/lib/locale/en_US.UTF-8/LC_MESSAGES", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/en_US.utf8/LC_MESSAGES", O_RDONLY) = 3
fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
close(3)                                = 0
open("/usr/lib/locale/en_US.utf8/LC_MESSAGES/SYS_LC_MESSAGES", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=52, ...}) = 0
mmap(NULL, 52, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7da0f41000
close(3)                                = 0
open("/usr/lib/locale/en_US.UTF-8/LC_MONETARY", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/en_US.utf8/LC_MONETARY", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=286, ...}) = 0
mmap(NULL, 286, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7da0f40000
close(3)                                = 0
open("/usr/lib/locale/en_US.UTF-8/LC_COLLATE", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/en_US.utf8/LC_COLLATE", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=921214, ...}) = 0
mmap(NULL, 921214, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7da0e51000
close(3)                                = 0
open("/usr/lib/locale/en_US.UTF-8/LC_TIME", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/en_US.utf8/LC_TIME", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=2451, ...}) = 0
mmap(NULL, 2451, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7da0f3f000
close(3)                                = 0
open("/usr/lib/locale/en_US.UTF-8/LC_NUMERIC", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/en_US.utf8/LC_NUMERIC", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=54, ...}) = 0
mmap(NULL, 54, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7da0f3e000
close(3)                                = 0
open("/usr/lib/locale/en_US.UTF-8/LC_CTYPE", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/en_US.utf8/LC_CTYPE", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=254076, ...}) = 0
mmap(NULL, 254076, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7da0e12000
close(3)                                = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(1, TIOCGWINSZ, {ws_row=46, ws_col=156, ws_xpixel=0, ws_ypixel=0}) = 0
stat("./", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
open("./", O_RDONLY|O_NONBLOCK|O_DIRECTORY|0x80000) = 3
fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
fcntl(3, F_GETFD)                       = 0x1 (flags FD_CLOEXEC)
getdents(3, /* 2 entries */, 4096)      = 48
getdents(3, /* 0 entries */, 4096)      = 0
close(3)                                = 0
close(1)                                = 0
close(2)                                = 0
exit_group(0)                           = ?
Process 15691 detached

ldd - tracking used libraries/versions

To figure out what libraries(and its location/version as well) an executable is using hit it with "ldd". When using it know in mind that it needs the exact path to the executable.

For example to track down what "ls" is using try this
~> ldd /bin/ls
       linux-vdso.so.1 =>  (0x00007fffdfad4000)
       librt.so.1 => /lib/librt.so.1 (0x00007f33ad7a1000)
       libselinux.so.1 => /lib/libselinux.so.1 (0x00007f33ad585000)
       libacl.so.1 => /lib/libacl.so.1 (0x00007f33ad37e000)
       libc.so.6 => /lib/libc.so.6 (0x00007f33ad01c000)
       libpthread.so.0 => /lib/libpthread.so.0 (0x00007f33ace00000)
       /lib64/ld-linux-x86-64.so.2 (0x00007f33ad9aa000)
       libdl.so.2 => /lib/libdl.so.2 (0x00007f33acbfc000)
       libattr.so.1 => /lib/libattr.so.1 (0x00007f33ac9f8000)

Sunday, June 20, 2010

Explain a database

Question
Explain a database in three sentences to your eight-year-old nephew.

Answer:
A database is like a book shelf. You look at all the spines and pick out the book you want to read.

2 eggs and a 100-story building

Question:
You are given 2 eggs. You have access to a 100-story building. Eggs can be very hard or very fragile means it may break if dropped from the first floor or may not even break if dropped from 100th floor. Both eggs are identical. You need to figure out the highest floor of a 100-story building an egg can be dropped without breaking. The question is how many drops you need to make in the worst case. You are allowed to break 2 eggs in the process.

Answer:
You can brute force it by starting at 1 and dropping your first egg until it breaks. This takes 100 drops, 1 egg and a good set of lungs... and maybe a defibrillator...

Since we have 2 eggs we can cut it down by wasting 1 egg first.
So divide and conquer, if we start half way and drop 1 egg. Then if it breaks we have to start at the bottom. However if it didn’t break then we can go up to the 75th floor and try agian. repeating this until we break the egg and then going back to the prior floor that was successful and start dropping the second egg until it breaks. This cuts down the number of drops to less than 50.

However as you can see the problem is when the first egg breaks we are back to the brute force method.

So we need to take it slower with the first egg. So lets start a the 10th floor. we drop the egg and if it doesn't break we go to the 20th floor and repeat. If it breaks we go to the 1st floor and brute force it until the 9th floor. Now consider what happens when the egg brakes on the 99th floor we needed to drop the first egg from the 10th-100th in steps of 10. That's 10 drops. Then we need to drop the second egg 91-99 that s 9 more times for a total of 19 drops in the worst case.

However this worst case happens once. Isnt there a way to make the worst case the same for all combination's and maybe reduce the overall worst case.

So with each drop of the first egg the number of drops we should make for the worst case with the second egg decreases by 1. That way the worst case count is constant. So to figure this out we start from the END at the 100th floor.. that drop will tell us if its 99 or 100 with 1 drop of the second egg at 99 to tell them apart. So the prior drop for the first egg should be at the 98th floor and there would be 2 drops of the second egg after it. Get the picture? So the sequence of eggs drops first then second drops work like this;

First egg(Sequence in reverse) => Second sequence egg if the broke

100 => 99
98 => 96,97
95 => 92,93,94
91 => 87,88,89,90
86 => 81,82,83,84,85
80 => 74,75,76,77,78,79
73 => 66,67,68,69,70,71,72
65 => 57,58,59,60,61,62,63,64
56 => 47,48,49,50,51,52,53,54,55
46 => 36,37,38,39,40,41,42,43,44,45
35 => 24,25,26,27,28,29,30,31,32,33,34
23 => 11,12,13,14,15,16,17,18,19,20,21,22
10 => 1,2,3,4,5,6,7,8,9

And the worst case is now 14

Interview tip - Answering Tech Questions

In interviews everyone wants to impress. So when a question of technical nature comes I have the tendency to find and present the best solution to the interviewer. I believe this to be incorrect.

Heres why;
  • If you mess up the solution up from nerves or pressure. Then what can the interviewer do but mark you down for it.
  • The best solution often takes time to consider and the interviewer is doing what in that time? While you sit around thinking what is the interview thinking about your performance?
  • Often if things get to slow the interviewer will start to interfere with you to speed it along. He might offer tips to you, tips that you may of may not have needed, either way he is again likely to take marks off you for having to help you.

So here is an alternate approach, its inspired in part by the the old social engineering tricks made famous by hackers. Consider for a moment what the interviewer is doing, he has a problem and he knows the solution, he also probably knows some tips or key info about the solution that can help you. SO if he isnt talking to you there is no way he can tell you them by accident.

First of all start at brute force. Call it that so the interviewer knows you know better. Talk about it and its negative points and get the interviewer involved get him talking with you about it, make him think that your his fellow peer and you two are talking about a problem at work. Then propose a fix on one of the negative points and slowly change your answer into your ideal one. This gets the interviewer use to the idea that your solution isnt final and its just a discussion point.

Pay close attention to the interviewer, if you propose a fix and the interview was to say "yes but that cant insert with 0(1) complexity" then bingo he might want a data structure like a hash. So again point out a negative of your new solution and suggest a hash and do it gradually so he doesn't catch on that what he said was your key.

Sometimes the interviewer will stay tight lipped in the case of a tight lipped interviewer run down your list of ideas on how to optimize things. Here is mine generally one.

  • Divide and Conquer:
    • Look at each part of the solution and see if there is any dependence between the parts how is it dependent and can it be broken apart?
    • Start looking at any abnormalities in the computation, if its is excluded does the computation become simpler, can the exception to the rule be added back in in the last moment?
    • Take an array or LUT and divide it in half whats the logic to merge the parts back together after the division?
    • For example take an array and sum all elements but the current element only, then cant you sum the left normally and then the right normally and then add the two sums on either side together of the current point together?
  • Inductive repeating computations
    • Examine the nature of computation especially ones that are repeated heavily or other pieces of data, reduce it to the induction formula and see if the partial results can be shared out some how.
    • For example sums are an accumulation, and that reduces into a mathematics induction formula; ie Sum(x) = Sum(x-1) + f(x)

Some worked examples:
Here are posts in which i work out a solution to the question with the above rules;
WILL EDIT AND ADD THEM LATER

Saturday, June 19, 2010

Interview tip - Company rejections.

If a Company rejects you then you have several choices;
1) Get mad and do something stupid.
2) Cry about it.
3) Be pro-activate and correct the faults, and then keep doing what you love.

I was just recently rejected from a large company that I really wanted to work for and after initial plans to hurt then in a stupid and childish way I calmed down talked it out with a friend, finally I decided to take his advice and follow the better road:

Be Pro-activate look back at the interview contents and figure out why they didn't hire you. Repeat the questions and deeply research the answers. Write it all up so that you can read it in six months and know what happened.

Then contact the recruiter(s)/interviewer(s) of the company that didnt hire you and network with him, ask him if he is interested and send him on yourself own self-review of the interview. Then in a few months pull out dust it off re-read it all do the interview again yourself and THEN reapply for another job at that same company, start with the people how accepted your networking request.

Here is Steve jobs talking about being proactive after being fired
http://www.ted.com/talks/steve_jobs_how_to_live_before_you_die.html

To bet or not to bet:: A Pairs card game

Question
Your friend offers to play a card game with you using a normal deck of 52 cards.  The rules of the game are that you'll turn over two cards at a time.  If the cards are both black, they go into his pile.  if they are both red, they go into your pile.  If there is one red and one black, they go into the discard pile.

You repeat the two card flipping until all 52 cards are used.  Whoever has more cards in their pile at the end wins.  But your friend also wins if there is a tie.  If you win, you get a dollar.  How much would bet on the game?

Answer 1)
First of all its a trick if you get one pair of reds then 1 pair of blacks remain extra in the deck once all the cards are draw the game will always draw and your friend will always win.

That wasn't interesting. SO...

Answer 2)
Assuming that we use a casino deck with a random mix of cards and we draw a random number of cards..

First consider the risk analysis;
P(Win)*Reward - P(Lose)*Bet > 0
Hence: Bet < Reward*P(Win)/P(lose)


Now the odds:
Lets simplify it we divide the 52 cards into 26 pairs. Each pair has four possibles BB, BR, RB, RR. The pairs all have the same chance of occurring.

So in terms of the score
+1 has 25%
0 has 50%
-1 has 25%

Lets start small;
If there is only 1 pair
then you have 25% chance of wining
and 50% + 25% = 75% chance of losing.

Since Bet < Reward*P(Win)/P(lose)
Bet < 1* (1/4)/(3/4)
     < 1/3

Hence you need to bet less than 33 cents against a dollar to come out on top.

As you can see above the only number that really counts is the probability of 0. Since the remaining chance splits evenly between you and your friend

So if there are 2 pairs. the possible 0 end scores become
0,0  => 1 way at 0.5^2
+1,-1 => 2 ways at 0.25^2

The permutations are; +-, -+

Total = 1/2^2 + 2*1/4^2
= 1/4 + 2/16
= 4/16 + 2/16
= 6/16
The remaining chance 10/16 splits evenly resulting in;
Chance to win 5/16
Chance to lose 11/16

Since Bet < Reward*P(Win)/P(lose)
Bet < 1* (5/16)/(11/16)
     < 5/11
     < 0.45
ie bet less than 45 cents to come out on top..

if there are 3 pairs then
000 => 1 way 0.5^3
+0- => nCr(3,1)=3 possible places for +1, leaving nCr(2,1)=2 places for -1
=> hence there are 3*2=6 ways total

They are:
+-0, +0-
0+-, -+0
-0+, 0-+

Total = 1 * 0.5^3 + 6 * 0.25^2 * 0.5
= 1/8 + 6/32
= 1*4/32 + 6*1/32
= 10/32

That leaves 22/32 which splits equally between you and your friend..
Chance to win 11/32
Chance to lose 21/32

Since Bet < Reward*P(Win)/P(lose)
Bet < 1* (11/32)/(21/32)
     < 11/21
     < 0.52
ie bet less than 52 cents to come out on top..


If there are 4 pairs
0000 => nCr(4,0)=1 arrangements of +, and nCr(4,4)=1 for 0s giving a total of  1 way 0.5^3
+00- => nCr(4,1)=4 possible arrangements for the +, leaving nCr(3,2)=3 arrangements for the 0
=> hence there are 4*3 = 12 ways total at 0.5^2 * 0.25^2

+00-, +0-0 ,+-00
0+0-, 0+-0, -+00
00+-, 0-+0, -0+0
00-+, 0-0+, -00+

++-- => nCr(4,2)=6 possible arrangements for +, leaving nCr(3,0)=1 arrangements for the 0
=> hence there are 6*1 ways total at 0.25^4

++--, +-+-, +--+, -++-,-+-+, --++

Total = (1 * 0.5^4) + (12 * 0.5^2*0.25^2) + (6 * 0.25^4)
= 1/16 + 12/(16*4) + 6/256
= 1*16/256 + 12*4/256 + 6*1/256
= (16+48+6)/256
= 70/256

That leaves 186/256 which splits equally between you and your friend..
Chance to win 93/256
Chance to lose 163/256

Since Bet < Reward*P(Win)/P(lose)
Bet < (93/256)/(163/256)
< 93/163
< 0.57
ie bet less than 57 cents to come out on top..

Ok so now lets generalize it. for the number of pairs n the probability of 0;
Denominator is always 4^n
Numerator is always Sum( 4^(n-2i) * nCr(n,i) * nCr(n-i, n-2i ) ) where i = floor(n/2) ... 0

Thus
P(0) = Sum(...) / 4^n

Therefore the
Chance to win = (1 - P(0) ) /2
Chance to lose = (1 - P(0) ) /2 + P(0)
= (1 + P(0) ) /2

Since Bet < Reward*P(Win)/P(lose)
Bet <  (1 - P(0) ) /2 / (1 + P(0) ) /2
<  (1 - P(0) ) / (1 + P(0) )
< (4^n - Sum(...) ) / (4^n + Sum(...))

Hence with the 26 pairs case the p(0) becomes 0.11 and the bet increases to < 0.80  cents


pairs p(0) Bet
1 0.5 0.333
2 0.375 0.4545
3 0.3125 0.52381
4 0.273438 0.570552
5 0.246094 0.605016
6 0.225586 0.631873
7 0.209473 0.653613
8 0.196381 0.671709
9 0.185471 0.687084
10 0.176197 0.700395
13 0.154981 0.73163
16 0.13995 0.754463
19 0.128585 0.77213
22 0.119604 0.786346
26 0.110116 0.801613

Thursday, June 17, 2010

Bit twidding vs lookup tables

In semiconductor and RTL design using a pipeline of operations is much faster than a LUT(Look up table). However in software the result is flipped and small scale LUTs cut the complex sequence of operations to a single simplified action. In short memory lookup tables are generally much faster despite the chance of cache and memory glitches. Since im a semiconductor background the result always seems to surprise me when I revisit it.

The output is
speed of twiddling: 1482
speed of a lookup table: 874

The test code was:
#include <iostream>
#include <algorithm>
#include <iterator>
using namespace std;

// ********************* BIT TWIDDLER ********************* //

#define SUM_MASK1 0x55555555
#define SUM_MASK2 0x33333333
#define SUM_MASK3 0x07070707
#define SUM_MASK4 0x000f000f
#define SUM_MASK5 0x0000001f

unsigned int SumOfOnesTwiddle(unsigned int x)
{
  x = (SUM_MASK1 & (x >>  1)) + (SUM_MASK1 & x);
  x = (SUM_MASK2 & (x >>  2)) + (SUM_MASK2 & x);
  x = (SUM_MASK3 & (x >>  4)) + (SUM_MASK3 & x);
  x = (SUM_MASK4 & (x >>  8)) + (SUM_MASK4 & x);
  x = (SUM_MASK5 & (x >> 16)) + (SUM_MASK5 & x);

  return x;
}

void SumOfOnesTwiddleArray(unsigned int* in, unsigned int* out, unsigned int size)
{
  for(int i=0;i<size;i++)
    out[i] = SumOfOnesTwiddle(in[i]);
}

// ********************* LOOKUP TABLE ********************* //

#define TABLE_SIZE  256
unsigned int table[TABLE_SIZE];

void computeTable()
{
  for(int i=0; i<TABLE_SIZE; i++)
      table[i] = SumOfOnesTwiddle(i);
}

unsigned int SumOfOnesTable(unsigned int x)
{
  return  table[(x    ) & 0xff] +
          table[(x>>8 ) & 0xff] +
          table[(x>>16) & 0xff] +
          table[(x>>24) & 0xff];
}

void SumOfOnesTableArray(unsigned int* in, unsigned int* out, unsigned int size)
{
  for(int i=0;i<size;i++)
    out[i] = SumOfOnesTable(in[i]);
}

//testing

template<class itor>
void printOut(itor start, itor end)
{
  cout << hex << "0x";
  copy (start, end, ostream_iterator<int>(cout, " 0x"));
  cout << endl << endl;
}

#define SIZE 5000
int main()
{
  unsigned int indata[SIZE];
  unsigned int outdata[SIZE];
  unsigned int outdata2[SIZE];

  clock_t start_time, end_time;
  computeTable();

  srand(time(NULL));
  for(int i=0;i < SIZE; i++)
    indata[i] = rand();

  printOut(indata, indata+SIZE);
  SumOfOnesTwiddleArray(indata, outdata, SIZE);
  SumOfOnesTableArray  (indata, outdata2, SIZE);
  printOut(outdata,  outdata+SIZE);
  printOut(outdata2, outdata2+SIZE);

  for(int i=0;i < SIZE; i++)
    if(outdata[i] != outdata2[i])
       cout << "BUG!" << endl;

  start_time = clock();
  for(int i=0;i < SIZE; i++)
    SumOfOnesTwiddleArray(indata, outdata, SIZE);
  end_time = clock();

  cout << dec << "speed of bit twiddling: " << (end_time - start_time) << endl;

  start_time = clock();
  for(int i=0;i < SIZE; i++)
    SumOfOnesTableArray(indata, outdata, SIZE);
  end_time = clock();
  cout << dec << "speed of a lookup table: " << (end_time - start_time) << endl;

}

Reservoir Sampling -- for an stl container..

Cant sleep.. Here is the Reservoir sampler rebuild for an stl container I assume you can see how to extend it to whatever STL compliant container is desired.

//compile using g++
#include <iostream>
#include <algorithm>
#include <iterator>

#include <vector>
#include <set>
//#include <unordered_map> 

using namespace std;

template<class Data>
class ReservoirSampler
{
  vector<Data>* samples;
  int size;
  int count;

public:
  vector<Data>* getSamples()
  { return samples;  }
  
  ReservoirSampler(int _size)
  {
    count = 0;
    size  = _size;
    samples = new vector<Data>();
    samples->resize(_size);
  }

  ~ReservoirSampler()
  {}

  void operator()(Data sample)
  {
    if(count < size)
      (*samples)[count] = sample;
    else if((rand()%count) < size)
      (*samples)[rand()%size] = sample;
    ++count;
  }
};

template<class Data>
class RandomSamplerSet : public set<Data>
{
public:
  void add(Data data)
  { 
    //O(logN)
    insert(data); 
  }
  
  void remove(Data data)
  {
    //O(logN)
    erase(data);
  }
  
  vector<Data>* sample(int k)
  {
    //O(N)
    ReservoirSampler<Data> sampler(k);
    for_each(this->begin(), this->end(), sampler);
    return sampler.getSamples();
  }
};

#define LIMIT 1000
void createRandSamples(RandomSamplerSet<int>* container, int size)
{
  for(int i = 0; i < size; i++)
    container->add(rand()%LIMIT);
}

template<class itor>
void printOut(itor start, itor end)
{
  // *copy* the data into the output stream
  copy (start, end, ostream_iterator<int>(cout, " ")); 
  cout << endl;
}

#define SIZE    20
#define SAMPLES 5
int main()
{
  srand(time(NULL));

  RandomSamplerSet<int> data;
  createRandSamples(&data, SIZE); 
  printOut(data.begin(), data.end());

  vector<int>* samples = data.sample(SAMPLES);
  printOut(samples->begin(), samples->end());
  delete samples;
}

Reservoir Sampling - K samples from an infinite stream

Question;
You have a stream of infinite queries (ie Google searches). Describe how you would find K uniformly distributed samples and write code for it.

Reservoir Sampling is an algorithm designed to take K samples from a stream of samples of unknown size in linear time.

How it works.
The idea is simple. Firstly note that until the samples appear you don't know the total number of samples. To avoid this problem you fill up a reservoir of samples and then adjust it as each new sample appears.

First record samples until you have filled the reservoir(K samples). Once you have the first K samples the probability of each sample being in the final result was K/K or 100%.

Now when the K+i sample appears the probability of the new sample appearing in one _PARTICULAR_ output slot becomes 1/K+i. Since there are K potential output slots there is K/(K+i) chance of it being in the output and i/(K+i) chance that it wont be.

So choosing whether or not to add the new sample is easy we toss the dice. If we decide that the new sample is in the output then one of the existing items needs to be removed. Since all the items that where added to output and have equal chance of being present they also have equal chance of being removed. Hence we can simply choose to remove one of the existing samples with a 1/K probability.

This make sense logically but how do we prove it.
  • Assuming the prior stage worked correctly each element in the output should have K/(K+i-1) chance of being present.
  • There is i/(K+i) chance that prior output will be unchanged this round. Hence each element has Ki/(K+i)(K+i-1) chance of being output if the unchanging choice is made for this stage.
  • There is K/(K+i) chance that this something in the output will change
    • And for each element there is (K-1)/K chance it will not be replaced thus a total of (K-1)/K * K/(K+i) * K/(K+i-1) chance it will not be removed.
    • So conversely there is a 1/K * K/(K+i) * K/(K+i-1) chance that it will be replaced.

Hence the net chance of the existing elements making it to the next round is the sum of the chance that nothing changed with the chance that it was not the one that changed.
= Ki/(K+i)(K+i-1) + (K-1)K/(K+i-1)(K+i)
=> (Ki + (K-1)K)/(K+i-1)(K+i)
=> K(i+K-1)/(K+i-1)(K+i)
=> K/(K+i)
Q.E.D.

And of course the chance of the new element making to the output was K/(K+i). As you can see it has nicely balanced out for the possible output items for this stage.

BUT HERE IS THE CATCH.. Note that the mathematics/algorithm considered only the choice between output and non-output state of the samples. IT MADE NO STATEMENT AS TO THE OUTPUTS ORDERING. Hence the output samples need to be randomly shuffled in the final stages of the algorithm or there will be an issue with the ordering of the samples.

The code:
//compile using g++
#include <iostream>
using namespace std;

int reservoirSample(int sample, int* samples, int size, int count)
{
  if(count < size)
    samples[count] = sample;
  else
    if((rand()%count) < size)
      samples[rand()%size] = sample;
  
  return ++count;
}

#define SIZE 10
// #define STREAM_AVERAGE 300
#define STREAM_AVERAGE 10
int main()
{
  int count = 0;
  int samples[SIZE];
  int sample;
  int i = 0;

  srand(time(NULL));

  cout << "Sample Stream: " << endl;
  while(
        (count < SIZE) || 
        (rand()%STREAM_AVERAGE > 0)
        )
    {
      sample = rand()%1000;
      cout << sample << " ";

      count = reservoirSample(sample, samples, SIZE, count);
    }
  cout << endl;
  cout << "Total samples: " << count << endl;

  cout << "Output samples: " << endl;
  for(i = 0;i < SIZE;i++)
    cout << samples[i] << " ";
  cout << endl;
}

Wednesday, June 16, 2010

tcpdump command notes

To dump all the lead packet data and ip/tcp headers in/out of the box (ie if suspect the box was hacked and is transmitting data..)

sudo tcpdump -Xx

If you track it to a url then you can filter it out a bit better with commands like this.. (the tcpdump man has many good filter examples.)
sudo tcpdump dst host www.google.com

here are the wiki entries with the packet defination
IP header
Tcp header

Rails - array of checkboxes

Here is a quick hacky way to make an array of check boxes for a rails form.

Warning: This version of the code hasnt been tested. It most likely needs some debug.
class ModelB  < ActiveRecord::Baseclass
  attr_accessor :name
  attr_accessor :value
end

class ModelA < ActiveRecord::Baseclass
   has_many :model_bs
end

<% form_for( @model_a,
             :url  => {:action => 'new' }, 
             :html => {:method => :post } ) do |form| %>
  <ul class="checkbox_array">
    <% ModelB.find(:all, :order => 'position asc').each do |item| %>
    <li>
      <%= check_box_tag 'model_a[model_b_ids][]', item.id,
          form.object.model_bs.include?(item), 
          { :id => "model_a_model_b_ids_#{item.id}" } %>
      <label for="<%= "model_a_model_b_ids_#{item.id}" %>"><%= h(item.name) %></label>
    </li>
    <% end %>
  </ul>
  <%= submit_tag t('New') %>
<% end %>

rails - Executing a single testcase

To direct run a single rails test case;
1) Run the full test system from rake stop it and remove the lead items up to the rake_test_loader.rb part of the command line.
2) Then add on the test case file from the rails app root dir.
3) The with your editor pull the test_... method name of the test and add a "-n..." (without a space) to the end of the line.

The resultant command line looks something like this.

/usr/bin/ruby1.8 -I"lib:test" "<gems_dir>/rake-0.8.7/lib/rake/rake_test_loader.rb" "test/functional/<test_case_file>.rb" -n<test_def_name>

Auto updating a debian based machine via crontab

Auto updating a debian based machine via crontab

30 0 * * * export PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin: /usr/bin/apt-get update -qq; /usr/bin/apt-get -y upgrade | /usr/bin/mail -s "server updated" your@email.address > /dev/null

Tuesday, June 15, 2010

Ruby get all global constants

Object.constants.sort.uniq
Object.const_get("RUBY_VERSION") 

Object can of course be replaced with any class to see its local constants

Monday, June 14, 2010

5 pirates and 100 gold coins...

Question:
There are five pirates, ranked from 5 to 1 in descending order. The top pirate has the right to propose how 100 gold coins should be divided among them. But the others get to vote on his plan, and if fewer than half agree with him, he gets killed. How should he allocate the gold in order to maximize his share but live to enjoy it? (Hint: One pirate ends up with 98 percent of the gold.)

Answer on the web:
The normal solution is to reduce and recurse.
At 2 pirates, number 2 gets all the gold and votes for himself and stays alive.
At 3 pirates, number 3 gets 99 coins and bribes 1 who will get nothing otherwise with 1 coin.

At 4 pirates,
* If 4 brides 1 with 1 coin then 1 might axe him anyway since he will get 1 coin when there are 3 pirates anyway
* but when there are 3 pirates 2 gets nothing so if 2 is bribed now it will work out.
So At 4 pirates, number 4 gets 99 coins and bribes 2 who will get nothing otherwise with 1 coin.

At 5 pirates,
* When there are 4 pirates 1 and 3 get nothing so bribing them with 1 coin will do
So At 5 pirates, number 5 gets 98 and 1 and 3 get 1 each.

However I believe this to be wrong.. Here is why:
At 5 pirates, What happens when if pirate 4 tries to counter bribe pirates 1 and 3.
My reasoning is simple. If 1 and 3 and smart enough to figure out the the forward case is when there are 4 pirates then isnt 4 also capable of that also? So in short the actions of 4 and 2 should also count into the 5 pirate calculation.

Lets start the reduction again.
At 2 pirates, number 2 gets all the gold and votes for himself and stays alive.

At 3 pirates,
* originally number 3 gets 99 coins and bribes 1 who will get nothing otherwise with 1 coin.
* however number 2 will counter bribe 1 with 2 coins and his reward would be 98.
* 3 needs to now better this also well as the normal case or die,
* additionally 2 can offer upto 100 gold in counter bribes to 1 and then he will win all future deals because it will become the 2 pirate case, which he always wins.
So at 3 pirates, 3 and 1 get 0 coins and 2 gets 100 coins. and maybe 3 gets to live

At 4 pirates things get crazy...
* originally number 4 gets 99 coins and bribes 2.
* but in the counter bribe case 3 knows that he wont get any thing and is likely to die if it reduces to 3 pirates.
* 2 knows that if it becomes 3 or 2 pirates he will always win.
* 2 cant bribe 3 because 3 knows he gets stiffed in the 3 pirate case so bribing 1 is also pointless
* 1 doesn't care because he is getting stiffed in all cases anyway
* 4 can offer 3 1 more gold than he would get in the the 3 pirate case
So at 4 pirates, 4 gets 99 gold and 3 gets 1 gold.

At 5 pirates (still here ... great!)
* originally number 5 gets 98 coins and bribes 3 and 1 with 1 each, however when counter brides exist.....
* 4 could get 99 if he was in control but would be in-control because of 3
* 4 needs 1 vote to live but 5 needs 2
* 3 is getting 1 coin only if 4 is in-charge and doesn't want there to be 3 pirates
* 2 wants to get the number of pirates down to 2 or 3 pirates but 4 and 3 can stop that from happening so he is back to no coins and possibly a short term bribe..
* 5 is basically stuffed like 3 was, but he needs 2 people to vote him or he dies.
* if 5 offers 1,2 or 3 a bribe up to 99 coins then 4 can still better that bride with 100 coins and survive that round but in the future 4 will not give anything to anyone but 3.
* 5 can only bribe 4 or 4s counter brides will undo him.
* at best 5 could only repeat 4's option and hope that 4 doesn't axe him.
So at 5 pirates, 4 gets 99 and 3 gets 1

Yea that was crazy....

Ruby script converting to uppercase first char in a "" delimitered string

Here is some ruby script to convert the first charter of each word to uppercase in a sub string delimited by "" on each line

#!/usr/bin/ruby
File.open(ARGV[0], "r") do |infile|
  infile.each_line do |line|
    # / line
    md = /(.*)\"(.*)\"(.*)/.match(line)
    if md
      mid = md[2] 
      mid.gsub!(/ ([a-z])([a-z]*)/) {
        " " + $1.upcase + $2
      }
      line = md[1] + '"' + mid + '"' + md[3]
    end
    puts line
  end
end

Sunday, June 13, 2010

SLT algorithm basics

STL algorithms are very powerful. They have 2 fundamental components.

The Algorithms:
  • These are the outer flow of control functions that direct the action of the operator classes. In general the internals are some form of for loop that passes over a series of STL compatible iterators and performs an action.
  • A list of them is here: http://www.cplusplus.com/reference/algorithm/

The Function/Operator classes
  • These are a class that implement the operator() overloader to provide the listed functionally.
  • Within this group of objects are an important subgroup of wrapper operators. These wrappers are designed to expand or covert the abilities of the other operators and to adapt existing function and class members for use with the STL algorithms.
  • A list of them is here: http://www.cplusplus.com/reference/std/functional/

I think I need to break this down and approach it in smaller chunks and apologies for the template heavy code. It cut down my typing.
#include <iostream>
#include <list>
#include <string>

#include <algorithm>
#include <functional>
#include <iterator>
using namespace std;

class RandN
{
  int n;
public:
  RandN(const int _n) { n = _n; }

  int operator() () const { return rand()%n; }
};

void readInAFIle(string filename)
{
  vector<int> values;
  ifstream file (filename);
  if (file)
    copy(istream_iterator<string> (file), istream_iterator<string>(), back_inserter (values));
}

template<class itor>
itor filterToThreshold(itor start, itor end, int threshold)
{
  //move to the end if... why call did they call it "remove"...
  return remove_if(start, end, bind2nd(greater<int>(), threshold));
}

template<class itor>
bool checkIfPresent(itor start, itor end, int value)
{
  return find(start, end, value) != end;
}

template<class itor>
void createRandSamples(itor start, int size, int limit)
{
  generate_n(start, size, RandN(limit));  
}

template<class itor>
void printOut(itor start, itor end)
{
  // *copy* the data into the output stream
  copy (start, end, ostream_iterator<int>(cout, " ")); 
  cout << endl;
}

template<class itor> 
void doDemo(itor start, itor end, string kind)
{
  cout << endl;
  cout << "Working on a " << kind << "..." << endl;
  printOut(start, end);
  cout << (checkIfPresent(start, end, 5) ? "there is a 5" : "No 5's") << endl;
  
  itor splitPoint = filterToThreshold(start, end, 5);
  int count = distance(start, splitPoint);
  cout << kind << " was filtered to a count of " << count << endl;
  printOut(start, splitPoint);
}

int main()
{
  const int SIZE = 20;
  list<int> valuesList;
  int valuesArray[SIZE];

  //create a rand list of stuff...
  srand(time(NULL));
  createRandSamples(valuesArray, SIZE, 10);
  createRandSamples(back_inserter(valuesList), SIZE, 10);

  doDemo(valuesArray, valuesArray + SIZE, "Array");
  doDemo(valuesList.begin(), valuesList.end(), "List");
}

Stroustrups C++ FAQ

Bjarne Stroustrup C++ FAQ

Const vs Mutable, Pointers, Reinit Constructors and Statics

The "const" and "mutable" keyword
  • "const" means that the data related to the variable is to be left constant.
  • Only constant functions can be called on a const object.
  • "mutable" means that the data within a constant can be modified without changing the object ie the object is logically constant.
  • "mutable" is often abused. Its correct use is to allow for things like access counters and other data items that are not directly related to the objects real data. For example making the size of a toy class mutable is an abusive usage.
  • const tends to be referred to as a poison pill by in-experienced programmers who don't know about const_cast and how it works. Correct use of constants and const_cast can produce so very interesting coding patterns.
Constant Data, Pointers and references
const int  value0 = 1;       //constant data
const int* value1 = &value0; //a constant pointer
const int& value2 = value0;  //a constant reference
  • The Constant data indicates that the contents of the object or inbuilt type are not modifiable
  • The Constant pointer and references define that the data the pointer refers to is constant
Pointers which are constant
char* const string = "abc";  //a pointer constant
  • The Pointer Constant define that the pointer its self is not modifiable but the data is free to be messed with.
Const member functions
void function() const;
  • When "const" is applied to member functions it means that the action of the function will not change the object.
  • members of the object marked as "mutable" can me modified and maintain the "const" status of the object.
  • Constant member functions can not return non-constant references to members of the constant class.
  • Nothing prevents the constant member function from modifing items outside the objects local scope such as static members of the class.
  • By extension a const member function can const_cast its own this pointer and then modify the non-const version of its self! And it can also return this to the outside world. Doing this is considered bad form however in some systems bad usage of const and mutable has made it a requirement.
#include <iostream>
using namespace std;

class BaseA
{
public:
  static const int  staticConstIntA = 0; //init in global
  const   int       constA;           //init in constructor
  int               intA;
  static int        staticA;
  mutable int       mutableA;
  //const mutable int f; //BAD illogical
 
  void setA_Const(int _in) const
  { 
    //staticConstIntA = _in; //BAD const is hard set
    //constA = _in;          //BAD const is already hard set
    staticA  = _in;
    //intA     = _in;        //BAD a const member is modifing the object
    mutableA = _in;
  }

  void setA(int const _in)
  { 
    //staticConstIntA = _in; //BAD const is hard set
    //constA = _in;          //BAD const is already hard set
    staticA  = _in;
    intA     = _in;
    mutableA = _in;
  }

  //BAD a const member is directly returning a modifable piece of itself  
  //int& getA_Const(int _in) const
  //{ return intA; }

  int& getA_UnConst() const
  { 
    BaseA* self = const_cast<BaseA*>(this);
    self->intA = 10;
    return self->intA;
  }
  
  BaseA(int _in) : constA(_in) 
  {
    //staticConstIntA = _in; //BAD const is hard set
    //constA = _in;          //BAD const is already hard set
    staticA  = _in;
    intA     = _in;
    mutableA = _in;
  }
};

int BaseA::staticA; 

int main()
{
  // *********normal class*************** //
  BaseA normalA(1);

  //normalA.constA = 10;         //BAD member is const
  //BaseA::staticConstIntA = 10; //BAD member is const
  normalA.intA = 1;
  normalA.mutableA = 1;

  normalA.setA(10);       
  normalA.setA_Const(10); 
  
  // *********constant class*************** //
  const BaseA  constA(3);

  //constA.intA = 10;    //BAD entire class is const making normal members also const
  constA.mutableA = 10;  //OK a mutable member can be modifed.

  //constA.setA(10); //BAD member function is not const
  constA.setA_Const(10); //GOOD  member function is const, and modifing a const class 

  // *********constant refed class*************** //
  const BaseA& refA = normalA;

  //constA.intA = 10;    //BAD entire class is const making normal members also const
  refA.mutableA = 10;  //OK a mutable member can be modifed.

  //constA.setA(10); //BAD member function is not const
  refA.setA_Const(10); //GOOD  member function is const, and modifing a class static

  // *********pointers to const class*************** //
  BaseA* ptrA = new BaseA(2);
  const BaseA* constPtrA = new BaseA(4);

  //constPtrA->intA = 1;    //BAD a const pointer means the DATA not the POINTER 
  constPtrA->mutableA = 1;

  //constPtrA->setA(10);   //BAD member function is not const
  constPtrA->setA_Const(10); //GOOD  member function is const, and modifing a const class

  constPtrA = ptrA;         //GOOD a const pointer means the DATA not the POINTER
  //const BaseA* ptrB = new(constPtrA) BaseA(20); //BAD reinit constructors dont work on const pointers

  // *********pointers which is const to class*************** //
  BaseA* const ptrConstA = new BaseA(5);

  ptrConstA->intA = 1;     //GOOD a pointer const means the POINTER not the DATA
  ptrConstA->mutableA = 1;

  ptrConstA->setA(10);       //GOOD a pointer const means the POINTER not the DATA
  ptrConstA->setA_Const(10); //GOOD a pointer const means the POINTER not the DATA

  //ptrConstA = ptrA;       //BAD a pointer const means the POINTER not the DATA
  const BaseA* ptrB = new(ptrConstA) BaseA(20); //GOOD reinit constructors DO work on pointers which are constant

  // *********UNLOCKING the const posion pill*************** //
  const BaseA  constB(3);  
  cout << "before unlocking: " << constB.intA << endl;
  int& unlockedA = constB.getA_UnConst();
  cout << "after unlocking: " << constB.intA << endl;
}

const casting

The "const_cast" keyword.
  • Operates on references and pointers only.
  • When casting into a const the type is simply set as const.
  • When casting out of a const on a fundamental data type a copy of the variable is made on the FIRST local removal of the const only. This is to move the variable from the um-manipulable BSS/data segment into the current stack area.
  • When casting out the const on class the const status is simply removed and the object becomes directly modifiable.

//compile with g++
#include <iostream>
using namespace std;

class BaseA
{
public:
  int a;

  BaseA()         {a = 0;   cout << "BaseA create" << endl; }
  BaseA(BaseA& o) {a = o.a; cout << "BaseA copy" << endl; }
};

void print(int& a)
{ 
  cout << "print(int a) step1:" << a++ << endl; 
  cout << "print(int a) step2:" << a++ << endl; 
}

void print(int* a)
{ 
  cout << "print(int a) step1:" << (*a)++ << endl; 
  cout << "print(int a) step2:" << (*a)++ << endl; 
}

void print(const int& a)
{ cout << "print(const int a):" << a << endl; }

void print(BaseA& a)
{ 
  cout << "print(BaseA& a) step1:" << a.a++ << endl; 
  cout << "print(BaseA& a) step2:" << a.a++ << endl; 
}

void print(const BaseA& a) 
{ cout << "print(const BaseA& a):" << a.a << endl; }


int main()
{
  int       int1 = 100;
  const int int2 = 200;
  
  //print(const_cast<const int>(int1)); //BAD only works on pointers and references
  //print(const_cast<int>(int2));       //BAD only works on pointers and references

  cout << "const references" << endl;
  const int& refInt1 = const_cast<const int&>(int1); 
  int&       refInt2 = const_cast<int&>(int2);  //OK but get the value from the BSS and load it into the stack   

  refInt2++;
  print(refInt1); 
  print(refInt2); 

  cout << "Original: int1 = 100; is NOW: "  << int1 << endl;
  cout << "Original: int2 = 200; is NOW: "  << int2 << endl;

  cout << "Cast: refInt1; is NOW: "  << refInt1 << endl;
  cout << "Cast: refInt2; is NOW: "  << refInt2 << endl;
  cout << endl;

  cout << "const pointers" << endl;
  int*       ptrInt2 = const_cast<int*>(&int2);  //OK but get the item was already loaded in to the stack
  cout << "Cast: ptrInt2; is NOW: "  << *ptrInt2 << endl;

  print      (ptrInt2); 

  cout << "Original: int2 = 200; is NOW: "  << int2 << endl;
  cout << "Cast: ptrInt2; is NOW: "  << *ptrInt2 << endl;

  cout << endl;

  cout << "const classes" << endl;
  BaseA       a1;
  const BaseA a2;

  const BaseA& refA1 = const_cast<const BaseA&>(a1);
  BaseA&       refA2 = const_cast<BaseA&>(a2);  

  refA2.a++;
  print(refA1); 
  print(refA2); 

  cout << "Original: a1.a = 0; is NOW: "  << a1.a << endl;
  cout << "Original: a2.a = 0; is NOW: "  << a2.a << endl;

  cout << "Cast: refA1.a; is NOW: "  << refA1.a << endl;
  cout << "Cast: refA2.a; is NOW: "  << refA2.a << endl;
}

reinterpret casting

The "reinterpret_cast" keyword.
  • Operates on references and pointers.
  • It results in only the direct coping of the pointer/reference and nothing more. This means that the exact value being pointed to is unchanged and will be interpreted as the new type directly.
  • The exact way the contents of the pointer are interpreted after conversion is very system dependent.
  • This cast is normally used to convert c++ data structures into something usable by a 3rd party API or lib, or in buffer pools and variable data type implementations. If a chunk of data has multiple interpretations to (like IP headers etc) then a "union" is better choice.

//compile with g++
#include <iostream>
using namespace std;

class BaseA
{
  virtual void print() { cout << "BaseA"; }
};

class DerivedA1 : public BaseA
{
  int a;
  virtual void print() { cout << "DerivedA1"; }
};

class DerivedA2 : public BaseA
{
  virtual void print() { cout << "DerivedA2"; }
};

class DerivedA3 : public DerivedA1
{
  virtual void print() { cout << "DerivedA3"; }
};

class BaseB
{
public:
  BaseB()         { }
  BaseB(BaseA& A) { cout << "Converting BaseA" << endl; }
  virtual void print() { cout << "BaseA"; }
};

int main()
{
  int    int1         = 100;
  double double1      = 3.142;

  cout << "reinterpret casting: non-conversion of data example " << endl;
  char*      char1    = reinterpret_cast<char*>(&int1);  //OK keep in mind the endianess of the system
  int*       int2     = reinterpret_cast<int*>(&double1); //OK be aware of the proceses native double formats

  cout << "reinterpret_cast<char>(int1)  " << *char1 << " <- " << int1    << endl;
  cout << "reinterpret_cast<int>(double) " << *int2  << " <- " << double1 << endl;
  cout << endl;

  BaseA*     base1    = new BaseA();
  BaseB*     baseB    = new BaseB();
  BaseA*     base2    = new DerivedA2();
  BaseA*     base3    = new DerivedA3();
  DerivedA3* derived1 = new DerivedA3();

  cout << "reinterpret casting: pointers " << endl;
  DerivedA1* derived2 = reinterpret_cast<DerivedA1*>(base1);  //BAD base1 doesnt have the correct memory to be DerivedA1 class!   
  DerivedA1* derived3 = reinterpret_cast<DerivedA1*>(base2);
  DerivedA2* derived4 = reinterpret_cast<DerivedA2*>(base2);
  DerivedA1* derived5 = reinterpret_cast<DerivedA1*>(base3);
  BaseA*     base4    = reinterpret_cast<BaseA*>(derived1);   //OK but overkill

  char*      charPtr  = reinterpret_cast<char*>(&int1);     //OK but be careful int and char dont share any inhertance.
  BaseB*     base5    = reinterpret_cast<BaseB*>(derived1); //BAD BaseB and dervied1 have conflicting memory layouts 

  cout << "reinterpret casting: references " << endl;
  DerivedA1& refDerived1    = reinterpret_cast<DerivedA1&>(*base1); //BAD base1 doesnt have the correct memory to be DerivedA1 class! 
  DerivedA1& refDerived2 = reinterpret_cast<DerivedA1&>(*base2);
  DerivedA2& refDerived3 = reinterpret_cast<DerivedA2&>(*base2);
  DerivedA1& refDerived4 = reinterpret_cast<DerivedA1&>(*base3);
  BaseA& refBase         = reinterpret_cast<BaseA&>(*derived1);
  BaseB& refBaseB        = reinterpret_cast<BaseB&>(*derived1); //BAD BaseB and dervied1 have conflicting memory layouts 

  delete base1;
  delete baseB;
  delete base2;
  delete base3;
  delete derived1;
}

static typecasting in C++

The "static_cast" keyword.
  • Operates on references and pointers of class and on the inbuilt data types and class with conversion constructors.
  • In the case of inbuilt data types it converts the value into the new type.
  • In the case of conversion constructor classes, it creates a new object from the original object using the conversion constructor. Note that copy constructors ARE conversion constructors and WILL be implicitly apply to any derived class.
  • In the case of pointers and references:

    • The cast confirms at compile time that the pointers convert from and to are are related by inheritance, ie they share the same Base class a some point.
    • Since the cast performs no RTTI check the it is faster and lighter than the "dynamic_cast".
    • If used badly on pointers or references it can convert invalidly, for example a base class can be converted to a Derived class despite the fact that it wont have memory for the derived classes member variables.

//compile with g++
#include <iostream>
using namespace std;

class BaseA
{
  virtual void print() { cout << "BaseA"; }
};

class DerivedA1 : public BaseA
{
  int a;
  virtual void print() { cout << "DerivedA1"; }
};

class DerivedA2 : public BaseA
{
  virtual void print() { cout << "DerivedA2"; }
};

class DerivedA3 : public DerivedA1
{
  virtual void print() { cout << "DerivedA3"; }
};

class BaseB
{
public:
  BaseB()         { }
  BaseB(BaseA& A) { cout << "Converting BaseA" << endl; }
  virtual void print() { cout << "BaseA"; }
};

int main()
{
  int    int1         = 100;
  double double1      = 3.142;

  cout << "static casting: inbuilt datatypes " << endl;
  char       char1    = static_cast<char>(int1); 
  int        int2     = static_cast<int>(double1); 

  BaseA*     base1    = new BaseA();
  BaseB*     baseB    = new BaseB();
  BaseA*     base2    = new DerivedA2();
  BaseA*     base3    = new DerivedA3();
  DerivedA3* derived1 = new DerivedA3();

  cout << "static_cast<char>(int1)  " << char1 << " <- " << int1    << endl;
  cout << "static_cast<int>(double) " << int2  << " <- " << double1 << endl;
  cout << endl;

  cout << "static casting: conversion constructor classes " << endl;
  BaseB      objB     = static_cast<BaseB>(*base1);
  BaseA      objA1    = static_cast<BaseA>(*base2); //OK the default copy constructor IS a conversion constructor 
  //BaseA      objA2    = static_cast<BaseA>(*baseB); //BAD no conversion constructor 

  cout << "static casting: pointers " << endl;
  DerivedA1* derived2 = static_cast<DerivedA1*>(base1);  //BAD base1 doesnt have the correct memory to be DerivedA1 class!   
  DerivedA1* derived3 = static_cast<DerivedA1*>(base2);
  DerivedA2* derived4 = static_cast<DerivedA2*>(base2);
  DerivedA1* derived5 = static_cast<DerivedA1*>(base3);
  BaseA*     base4    = static_cast<BaseA*>(derived1);   //OK but overkill

  //char*      charPtr  = static_cast<char*>(&int1);     //BAD int and char dont share any inhertance.
  //BaseB*     base5    = static_cast<BaseB*>(derived1); //BAD BaseB and dervied1 dont share any inhertance.

  cout << "static casting: references " << endl;
  DerivedA1& refDerived1    = static_cast<DerivedA1&>(*base1); //BAD base1 doesnt have the correct memory to be DerivedA1 class! 
  DerivedA1& refDerived2 = static_cast<DerivedA1&>(*base2);
  DerivedA2& refDerived3 = static_cast<DerivedA2&>(*base2);
  DerivedA1& refDerived4 = static_cast<DerivedA1&>(*base3);
  BaseA& refBase         = static_cast<BaseA&>(*derived1);
  //BaseB& refBaseB        = static_cast<BaseB&>(*derived1); //BAD BaseB and dervied1 dont share any inhertance.

  delete base1;
  delete baseB;
  delete base2;
  delete base3;
  delete derived1;
}

Saturday, June 12, 2010

dynamic typecasting in C++

The "dynamic_cast" keyword.
  • Operates on references and pointers only.
  • Requires the item be a class and that class to be a polymorphic class, ie it has at least one virtual function
  • Uses the RTTI information to check the capability of the conversion at runtime
  • In the case of references it will throw an exception if the conversion failed.
  • In the case of pointers it will return NULL if the conversion failed.

//compile with g++
#include <iostream>
using namespace std;

class BaseA 
{
  virtual void print() { cout << "BaseA"; }
};

class DerivedA1 : public BaseA 
{
  virtual void print() { cout << "DerivedA1"; }
};

class DerivedA2 : public BaseA 
{
  virtual void print() { cout << "DerivedA2"; }
};

class DerivedA3 : public DerivedA1
{
  virtual void print() { cout << "DerivedA3"; }
};

int main()
{
  int int1 = 100;
  BaseA*     base1 = new BaseA();

  //derived class dont need casting into the base
  BaseA*     base2    = new DerivedA2();
  BaseA*     base3    = new DerivedA3();
  DerivedA3* derived1 = new DerivedA3();

  cout << "dynamic casting: pointers " << endl;
  //char*      char1    = dynamic_cast<char*>(&int1); //BAD requires a class
  DerivedA1* derived2 = dynamic_cast<DerivedA1*>(base2);
  DerivedA2* derived3 = dynamic_cast<DerivedA2*>(base2);
  DerivedA1* derived4 = dynamic_cast<DerivedA1*>(base3);
  BaseA*     base4    = dynamic_cast<BaseA*>(derived1); //OK but overkill
  
  cout << "dynamic_cast<DerivedA1*>(base2); " << (derived2 ? "PASSED" : "FAILED") << endl;
  cout << "dynamic_cast<DerivedA2*>(base2); " << (derived3 ? "PASSED" : "FAILED") << endl;
  cout << "dynamic_cast<DerivedA1*>(base3); " << (derived4 ? "PASSED" : "FAILED") << endl;
  cout << "dynamic_cast<BaseA*>(derived1);  " << (base4 ? "PASSED" : "FAILED") << endl;
  cout << endl;

  cout << "dynamic casting: references " << endl;
  cout << "dynamic_cast<DerivedA1&>(*base2) ";
  try
    { 
      DerivedA1& refDerived = dynamic_cast<DerivedA1&>(*base2); 
      cout << "PASSED" << endl; 
    }
  catch (...)
    { cout << "FAILED" << endl;  }
    
  cout << "dynamic_cast<DerivedA2&>(*base2) ";
  try
    {
      DerivedA2& ptrDerived = dynamic_cast<DerivedA2&>(*base2); 
      cout << "PASSED" << endl;
    }
  catch (...)
    { cout << "FAILED" << endl;  }

  cout << "dynamic_cast<DerivedA1&>(*base3) ";
  try
    { 
      DerivedA1& refDerived = dynamic_cast<DerivedA1&>(*base3); 
      cout << "PASSED" << endl; 
    }
  catch (...)
    { cout << "FAILED" << endl;  }

  cout << "dynamic_cast<BaseA&>(*derived1)  ";
  try
    { 
      BaseA& refBase = dynamic_cast<BaseA&>(*derived1); 
      cout << "PASSED" << endl; 
    }
  catch (...)
    { cout << "FAILED" << endl;  }

  delete base1;
  delete base2;
  delete base3;
  delete derived1;
}

Convert code into Blogger postable code.

http://www.accessify.com/tools-and-wizards/developer-tools/quick-escape/default.php

http://francois.schnell.free.fr/tools/BloggerPaste/BloggerPaste.html

new/delete vs new[]/delete[]

The "new" keyword
  • "new" will first created the memory for 1 object and then execute the constructors chain starting from the Derived class recursing to the very Base class and then executing the constructor bodies back up to the Derived.
  • New will throw the std::bad_alloc exception if there isnt enough memory. The exception can be suppressed with "new(nothrow) <object>"
  • Once allocated the size of memory can not be changed as malloc can be with realloc
  • Once allocated the memory can be reinited with "new(<pointer>) <class>". This should only be done in very special cases such as a buffer pool, custom memory management system or something like the vector class. DONT MESS WITH THIS UNLESS YOU HAVE A CLUE.

The "delete" keyword
  • The delete will first call all destructors and then deallocate memory for 1 object.
  • It is safe to pass NULL into delete as it will return directly and perform nothing.
  • Throwing exceptions from inside a delete is technical possible and will result in the abortion of the destructor sequence and failure to deallocate the memory. It is considered bad form because most destructor code doesn't expect to re-called if a failure occurred.
  • delete will not be implicitly called at the end of a program, but the memory will be reclaimed by the system.
  • calling delete on a "new[]" allocated block of data will execute the destructor of the first object only and fail to correctly deallocate the memory. This is deliberating done to support the rebasing new constructor described above. DONT MESS WITH THIS UNLESS YOU HAVE A CLUE.

The "new[]" keyword
  • "new[]" logically the same as "new" but will create and array of memory, and for each will call the constructors like "new" does
  • Items in a "new[]" allocated array can be reinited with the "new" keyword. DONT DO THIS UNLESS YOU REALLY HAVE A CLUE!
  • memory allcated with "new[]" must be deleted with "delete[]", if "delete is used the first items destructor will be called and the memory release will fail. This is to support the reiniting constructor above. DONT MESS WITH THIS UNLESS YOU HAVE A CLUE.

The "delete[]" keyword
  • "delete[]" operates logically the same as the "delete" but removes an array of memory
  • "delete[]" will crash out if handed memory that was allocated by "new"

#include <iostream>
using namespace std;

class Counting
{
public:
  static int total_count;
  int count;
  
  Counting() 
  { 
    count = total_count++; 
    cout << "Counting" << count << " created" << endl;
  }

  ~Counting() 
  { 
    cout << "Counting" << count << " destroyed" << endl;
  } 
};

int Counting::total_count = 0;

class ThrowingDestructor
{
public:
  int count;
  ThrowingDestructor() { count = 2; }
  ~ThrowingDestructor() { count--; if(count > 0) throw "STOP NOW!"; }
};

int main()
{
  ThrowingDestructor* thrower = NULL;

  //this is technically ok but bad form!
  thrower = new ThrowingDestructor(); 
  try 
    { delete thrower; }
  catch (...)
    { cout << "catch1!" << endl; }
  try 
    { delete thrower; }
  catch (...)
    { cout << "catch2!" << endl; }

  Counting* count  = NULL;

  delete count; //OK delete doesnt react to NULLs

  count = new Counting(); 
  delete count;
  //count = new(count) Counting(); //BAD reexecute the constructor on count which is dealloc'ed, will crash

  count = new Counting(); 
  count = new(count) Counting(); //OK reexecute the constructor on count
  delete count;

  count = new Counting[10];
  new(&count[2]) Counting(); //OK will reinit the memory for the item at index 2
  delete[] count;  

  count = new Counting[10];
  delete count;    //BAD will call destrutor of the first element only, and delete will not remove the memory.
  //count = new(count) Counting(); //will not CRASH proving that the memory is still allocated
  delete[] count;  //BAD will now double free the memory and call the first elements destructor again!!!

  count = new Counting();
  //delete[] count;  //BAD will attempt to free space that has not been allocated as an array and crash 
}

Pure Virtual Functions

Pure virtual functions should be used;
  • to create an abstract object that is not supposed to be instantiated
  • to force a compile time check of the derived classes to confirm it has implemented the function

#include <iostream>
using namespace std;

class Base
{
public:
  virtual void print_me() = 0;
};

class DerivedBad : public Base
{
public:
  
};

class DerivedGood : public Base
{
public:
  virtual void print_me() { cout << "I am Good" << endl; };  
};

int main()
{
  //Base* a = new Base();       //BAD it has a pure virtual function and is therefoe an abstract class.
  //Base* b = new DerivedBad(); //BAD it fails to implement print_me
  Base* c = new DerivedGood(); //GOOD it fully implements the interface to Base
  c->print_me();

  delete c;
}

Template Vs Macro

  • Macros just expanded out by the compiler, which means code is copied around.
  • Macros are not type safe. A template will compare the input types and check them against the type you spec'ed when using it.
  • Macros don't obey C++'s namespaces and locality rules they can be active in unexpected places in the code. Consider what happen if someone was to make a macro "i" in a random header some where.
  • The result of the macro is what the compiler really works on so errors are more troublesome to find.. it helps to know that with gcc and g++ the "-E" command line option will stop after the pre-processor has completed.

For example;
//compile with g++ file_name
#define MIN(a,b) (((a)<(b)) ? (a) : (b))

template<class T> T min(T a,T b) { return (a<b) ? a : b; }

int main()
{
  int x = 10;
  int y = 20;
  
  int macro  = MIN(x*3,y);
  int tplate = min<int>(x*3,y);
}    

Now expand it and view the result , notice the macro duplicated its inputs;
//run g++ -E file_name
# 1 "macro_template.cpp"
# 1 ""
# 1 ""
# 1 "macro_template.cpp"

template<class T> T min(T a,T b) { return (a<b) ? a : b; }

int main()
{
  int x = 10;
  int y = 20;

  int macro = (((x*3)<(y)) ? (x*3) : (y));
  int tplate = min<int>(x*3,y);
}

Friday, June 11, 2010

Customer management tactics

Customers tend to glaze over or become micro managers when too much detailed technical information is given to them.

The best way to avoid this to simplify the interaction to a cost benefit choice. Offer the customers a choice of A for price Y or B for price Z. the customers will then focus on the price and the requirements and not the tech details.

When quoting price you should keep in mind the average customer will always take the lowest price. This means that you can lead them to make a good choice by balancing the price to push them that way. Also pricing should account for the doing, and the overhead (emails, calls especially if the customer keeps changing there mind.

Switching the editor on crontab

Switching the editor on crontab et-al permanently.

update-alternatives --list editor
update-alternatives --set editor <one_of_the_paths_listed>

determining what shared libs a program is using

To trace what libs an exe is using run this.

ldd <full_path_to_exe>

If you where to ldd ls you get this kind of result

ldd /bin/ls
   linux-vdso.so.1 =>  (0x00007fff04dfe000)
   librt.so.1 => /lib/librt.so.1 (0x00007f3c869ce000)
   libselinux.so.1 => /lib/libselinux.so.1 (0x00007f3c867b2000)
   libacl.so.1 => /lib/libacl.so.1 (0x00007f3c865ab000)
   libc.so.6 => /lib/libc.so.6 (0x00007f3c86249000)
   libpthread.so.0 => /lib/libpthread.so.0 (0x00007f3c8602d000)
   /lib64/ld-linux-x86-64.so.2 (0x00007f3c86bd7000)
   libdl.so.2 => /lib/libdl.so.2 (0x00007f3c85e29000)
   libattr.so.1 => /lib/libattr.so.1 (0x00007f3c85c25000)

Thursday, June 10, 2010

iPhone optimization -- timing your code

Direct C calls are the best in the iPhone the below test doesn't account for the time to release the date object as well but it clearly shows which is the better approach.

CFAbsoluteTime profile_start = CFAbsoluteTimeGetCurrent();

for(int i = 0; i < 100000; i++)
  CFAbsoluteTimeGetCurrent();
NSLog(@"direct C call: %0.5f", CFAbsoluteTimeGetCurrent() - profile_start);

profile_start = CFAbsoluteTimeGetCurrent();
for(int i = 0; i < 100000; i++)
  [NSDate date];

NSLog(@"NSDate call: %0.5f", CFAbsoluteTimeGetCurrent() - profile_start);

The output is;
direct C call: 0.01097
NSDate call: 0.05005

netstat -- get the current active connections

Using netstat to get the current list of active connections

netstat -antp 

purging the "rc" (release candidate) packages from a system

dpkg -l | grep '^rc'| cut -d' ' -f3 | xargs dpkg -P

Internet business models

Some food for thought..

Here are a few working biz models for online:
1) 90% free/cheap 10% premium type sites. (etrade? )
2) free for all large scale service/products with Ads support. (Google, Commerical tv)
3) world wide online shops. (Amazon)

Online take over
All information based resources like books, mags, news papers, movies will ultimately move to the Internet.

Amazon/Kindle books and iTunes are strong working biz models for it.

Consider what would happen if movies where sold at the price of a rental movie world wide. Distribution could be p2p, with the purchase giving the download key for that Moive. Once purchased any numbr of repeat downloads is valid for the user. The end result is all the money from current rental places will move into the new biz model.

Ref here for more details
http://digitalenterprise.org/models/models.html

Wednesday, June 9, 2010

Vulnerability reporting lists of note

US-CERT Technical Cyber Security Alert

http://www.us-cert.gov/cas/techalerts

Secunia

http://secunia.com/

Runing VMWare 2 console out side of firefox..

It appears that the firefox plugin of Vmware console includes the vmware remote control component!..

Here is the hack for linux to get it runing independent of firefox in ubuntu. You need to have installed the plugin into firefox first off.

cd .mozilla/firefox/<profile>/
cd extensions/VMwareVMRC@vmware.com/
cd plugins/
chmod u+x ./lib/wrapper-gtk24.sh
bin/
chmod u+x *
cd ../
./vmware-vmrc -u <user> -h <url_ip>:8333 -K