Notorious Devil: 2009

Tuesday, December 15, 2009

Signals in Linux

- List of signals: $kill -l
- You can't catch, block, or ignore SIGSTOP and SIGKILL. You also can't prioritize which pending signal gets handled first.
- Signals are generated by setting the appropriate bit in the task_struct's signal field. If the process has not blocked the signal and is waiting but interruptible (in state Interruptible), it is woken up by changing its state to Running and making sure that it is in the run queue. (linuxhq.com)
- Signals that are sent to a process because of an illegal flow of execution are synchronous. They are also called traps, e.g. an illegal memory access.
- Asynchronous signals are also called interrupts and are sent from one process to another, or from thread to thread.
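
To see the SIGKILL/SIGSTOP rule in action, here is a minimal sketch using the POSIX sigaction() call. The function names (can_install_handler, handler_runs_for_sigusr1) are mine, made up for illustration. The kernel should accept a handler for SIGUSR1 and refuse one for SIGKILL or SIGSTOP.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set by the handler. volatile sig_atomic_t is the type that can be
   written safely from inside a signal handler. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Try to install on_signal for signo. Returns 1 on success and 0 if the
   kernel refuses, as it does for SIGKILL and SIGSTOP. */
int can_install_handler(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL) == 0;
}

/* Install a handler for SIGUSR1, send the signal to ourselves, and
   report whether the handler ran. */
int handler_runs_for_sigusr1(void)
{
    got_signal = 0;
    if (!can_install_handler(SIGUSR1))
        return 0;
    raise(SIGUSR1);
    return got_signal == 1;
}
```

Calling can_install_handler(SIGKILL) is a quick way to convince yourself that these two signals really are off limits.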

Monday, December 14, 2009

Some useful questions

- Access privileges for private functions in C++?
- Controlling exclusive access to a variable with two processors and two threads?
- Pattern substitution with sed?
- What is md5?
- When do you allocate memory for static variables?
- Phases of compiler, AST?
- When do we allocate storage for static?
- Data hiding and encapsulation?
- Is Vtable per class/object?
- Difference between COFF and ELF?
- Compiler phase of function in-lining?
- Difference between macro and function in-lining?
- Types of parsers?
- YACC is which parser?
- How lex and yacc work?
- How does the linker get info about static variables?
- How protected specifier works in C++?
- What is memory leak?
- Accessing static outside its scope?
- How assembler works?
- When do we do semantic analysis?
- You have a file with write-only access, and you have N threads. How would you ensure sharing of this file among these threads considering performance a prime concern?
- Singleton class? A real world example.
- How would you classify a file based on contents of the class? How many search keyword in the file are necessary?

Tuesday, December 1, 2009

Bouquet of questions-3

o How can you add attributes in gcc, e.g., changing a function's calling convention from cdecl to stdcall?

GCC allows you to annotate your functions with the __attribute__ keyword.
Attributes let you write cleaner, more readable code. The ones I liked:

- __attribute__((destructor))
- __attribute__((constructor))
- __attribute__((warning("Function definition is not found")))

There are plenty of them, check them out at: http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html
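
As a small sketch of the constructor/destructor pair (GCC and Clang only; the function names here are hypothetical), the constructor runs before main() starts and the destructor runs after it finishes:

```c
#include <stdio.h>

static int setup_done = 0;

/* Runs before main() when built with GCC or Clang. */
__attribute__((constructor))
static void setup(void)
{
    setup_done = 1;
}

/* Runs after main() returns or exit() is called. */
__attribute__((destructor))
static void teardown(void)
{
    puts("teardown ran");
}

/* Reports whether the constructor has already run. */
int setup_has_run(void)
{
    return setup_done;
}
```

By the time anything in main() calls setup_has_run(), it should already return 1.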

o What does following code do?

#include <stdio.h>

int flag = 2;
int (*fp)(void) __attribute__((cdecl));
void fun() __attribute__((warning("No definition")));
int main(void)
{
    //fun();
    fp = main;
    puts("called");
    while(flag--) {
        int x = (*fp)();
    }
}

Sunday, November 29, 2009

Bouquet of questions-2

o What is a bus error?

Bus refers to the address bus, and the error means an illegal address was passed to it. There are two signals that the kernel can send for an illegal address:
- SIGBUS
- SIGSEGV

SIGBUS is issued for an obviously wrong virtual address (e.g. a misaligned access on a CPU that requires alignment): no address translation is needed and the CPU can say outright that the address is bogus.

SIGSEGV is issued when the CPU realizes the address is bogus only after translating it.

SIGBUS is cheaper to detect since it avoids address translation. But which signal is issued depends on the CPU.

o What will be the output?

#include <stdio.h>

int main(void)
{
    int a[10];
    printf("%zu", sizeof(a));
}

Friday, November 6, 2009

Bouquet of questions

o What is a cache line?

- Smallest unit of data transfer between cache and main memory.
- Finest level of granularity

o How can I get IPC information over processes?

- Use $ipcs

o Tell the value of enum's elements.

enum e_tag { a, b, c, d=20, e, f, g=20, h } var;

- a=0, b=1, c=2, d=20, e=21, f=22, g=20, h=21
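
If in doubt, let the compiler answer. A short sketch (print_enum_values is a name I made up) that prints what each enumerator really holds. Each unnumbered enumerator is one more than the previous one, so c comes out as 2 and h as 21:

```c
#include <stdio.h>

enum e_tag { a, b, c, d = 20, e, f, g = 20, h };

/* Print each enumerator with the value the compiler assigned to it. */
void print_enum_values(void)
{
    printf("a=%d b=%d c=%d d=%d e=%d f=%d g=%d h=%d\n",
           a, b, c, d, e, f, g, h);
}
```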

o Why use volatile?

- To stop the compiler from optimizing accesses to a variable in two cases:
+ Memory shared with code the compiler can't see (e.g. a shared library variable updated by someone else)
+ Value updated implicitly by hardware

It keeps the compiler from optimizing away Load/Store instructions.
e.g., say you have a variable (X) that reads from a shared lib variable (Y) while Y is being updated by another process. The compiler has no idea that Y is updated externally, so without volatile it may remove these LOAD/STORE instructions during compilation.
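
A minimal sketch of what volatile buys you, with hypothetical function names. Because the pointers are volatile-qualified, the compiler must emit every load you write, instead of reusing a value it read earlier:

```c
/* Read a status word twice. With volatile, both loads are emitted even
   though nothing in this function changes *status in between. */
int read_twice(const volatile int *status)
{
    int first = *status;   /* load 1 */
    int second = *status;  /* load 2: not merged with load 1 */
    return first + second;
}

/* Spin until *ready becomes non-zero or max_spins iterations pass.
   Returns the number of iterations spent waiting. Without volatile the
   compiler could hoist the load out of the loop and test it only once. */
int wait_until_ready(const volatile int *ready, int max_spins)
{
    int spins = 0;
    while (*ready == 0 && spins < max_spins)
        spins++;
    return spins;
}
```

Note that volatile only controls what the compiler does with loads and stores; it is not a substitute for locks or atomics between threads.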

Thursday, October 22, 2009

Digging Intel Itanium: RSE, Register Stack Engine

Itanium processors heralded a new era of processing. Though not very successful commercially, IPF (Itanium Processor Family) has added many wonderful techniques to computer science. I'll discuss the one I liked the most.

It is called RSE (Register Stack Engine), a technique that avoids using main memory during function calls and does the work in processor registers instead. Conventionally, when a function call is made, the calling function passes arguments that are saved in main memory, and after this the return address is saved too. Main memory access is slow compared to processor speed. Itanium has 128 general-purpose registers (GPRs), and 96 of those are available to the RSE. These 96 registers take care of the function call mechanism, appearing as a register stack frame to the application. This bypasses memory access until all 96 registers are occupied. Interestingly, the processor itself is responsible for running this show, and it's transparent to the application.

More here: http://software.intel.com/en-us/articles/itaniumr-processor-family-performance-advantages-register-stack-architecture

Tuesday, October 13, 2009

Fast hard drives: How?

A simple hard drive today is capable of things that sound like some outlandish technology. Just try to do some file I/O in your application and do it with many threads.
Say you have 4 threads: A, B, C, and D. The I/O requests come in from A, then B, and so on. If you check when these threads return, the ordering might be surprising. Thread D may return before A. How?

Disks have a technology called Native Command Queuing (NCQ). They take your requests in and process them with a single, simple logic:
-> Serve the one you can do fastest.
This depends on the head position of the disk. The request that can be served with minimal movement of the head is served first.
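
The "nearest request first" rule can be sketched as below. This is a simplification under my own assumptions: requests are modeled as plain track numbers and pick_next_request is a made-up name. A real drive also considers things like rotational position.

```c
#include <stddef.h>
#include <stdlib.h>

/* Return the index of the pending request whose track is closest to the
   current head position, or -1 if there are no requests. On a tie the
   earlier request wins. */
int pick_next_request(long head, const long *tracks, size_t count)
{
    int best = -1;
    long best_distance = 0;
    for (size_t i = 0; i < count; i++) {
        long distance = labs(tracks[i] - head);
        if (best < 0 || distance < best_distance) {
            best = (int)i;
            best_distance = distance;
        }
    }
    return best;
}
```

With the head at track 50 and threads A..D asking for tracks 100, 10, 55 and 90, the request from C (track 55) would be picked first, even though A asked earlier.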

Sunday, October 4, 2009

A few 'why' answered

There are certain 'whys' that we may have missed, so I am trying to attend to them... one by one.

o Why do we need a hash table?
Of course, for better, faster search. It can even get us an element in O(1).

But we also need hashing when the given data is in a form that can't be ordered. Images are an example: how would you order a set of images and search for one?
Hashing comes to help here. It generates an (ideally) unique hash key for such data. We save these keys in a hash table, so searching for an image becomes searching for a key in the hash table. Keys are generated with a hash function. More about that later.
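
As an illustration of turning unordered data (say, the raw bytes of an image) into a key, here is a sketch using the 32-bit FNV-1a hash. FNV-1a is my pick for the example, not something this post prescribes, and TABLE_SIZE and bucket_for are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 64u  /* hypothetical number of buckets */

/* 32-bit FNV-1a: fold every byte of the data into one key. */
uint32_t fnv1a(const void *data, size_t len)
{
    const unsigned char *bytes = data;
    uint32_t hash = 2166136261u;       /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        hash ^= bytes[i];
        hash *= 16777619u;             /* FNV prime */
    }
    return hash;
}

/* Map the key to a slot in a table of TABLE_SIZE buckets. */
size_t bucket_for(const void *data, size_t len)
{
    return fnv1a(data, len) % TABLE_SIZE;
}
```

Two different inputs can still land in the same bucket, which is why the key is only "ideally" unique and a real table needs a collision strategy.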

To be continued with more whys.

Tuesday, September 22, 2009

HP Caliper : A Profiler

Developers rarely concern themselves with the efficiency of their programming logic. It's not a hard-and-fast observation, but that's how most amateur developers write code. Yet performance is becoming more critical day by day, and commercial applications vie for as much performance gain as possible.

Performance can take a hit for a motley of reasons, but application logic is what we are going to stress here. You may write code that performs badly with the cache system of your hardware, or you may schedule your threads inappropriately. There may be many causes that you do not become aware of unless someone finds them.

HP Caliper is one of my favorite tools, and it can help you find many causes of application slowdown. It is an Intel Itanium based tool and runs on HP-UX & Linux.
Talk about its features and I may run out of space. It can make basic profiles like a sampled call graph, a flat function profile, and a CPU events profile. Besides, it can provide a call-stack profile (critical for I/O bound applications) and a data cache profile (to help you re-layout the data structures).

The best part is that it does not need a recompile of the application or any library. Just give it a binary or attach it to a process. Run it and there you are, with third-party insight into your logic. It comes with a command line interface and a GUI.

The only downside is that it runs on Itanium binaries only, so other users have to wait till it becomes available for them too.

Happy profiling!!

Friday, August 28, 2009

NUI: Mobile GUI development

I found "nui", a C++ based GUI development framework. You can develop iPhone apps with it as well. It runs on Linux, Windows, and Mac.
http://www.libnui.net/

I also came to know about "enthought" (http://www.enthought.com/), another GUI development framework. This one automatically develops the GUI if you provide the class definition.

I've not tried either of these. But they both sound interesting, so I thought of sharing :-)

Monday, August 17, 2009

Python questions for interview

Python is bliss for quick development. It has a mix of C & C++ features and comes in the flavor of a scripting language. For a C/C++ developer who happens to work with Python, here are a few interview questions that are frequently asked:

1. What can Python do for you in OOP?
Python supports OOP principles fairly well. You can declare and define classes that follow the same OOP philosophy as C++ and Java. Features like inheritance and polymorphism are there too.

2. Does it support operator overloading?
Yes, through special methods such as __add__ and __eq__.

3. What is pickling?
It's an object serialization technique, very similar to the marshaling/un-marshaling of packet data in networks.

4. Does it support function overloading?
No. A later definition with the same name simply replaces the earlier one.

5. How do you pass a variable number of arguments to a function?
def foo(*pass_many):
    ...
Inside the function, the arguments arrive as a tuple.

6. Difference between a tuple and a list?
You know it. Tuples are immutable objects while lists are mutable.

Highly recommended: http://www.learningpython.com/2008/06/21/operator-overload-learn-how-to-change-the-behavior-of-equality-operators/

Tuesday, July 21, 2009

Careful with both hands while using fork!

fork() is one of the most useful features of C/Linux/UNIX. But it's a double-edged sword, so be careful with fork :-)
Of late, I got stuck on a weird problem with a client application (A) that interacts with another application (B). Application A was hanging when used with application B; alone, A runs just fine.
Now what to do? We did a thorough examination of both applications and found that A was waiting on a pipe P. B holds the write end of P and A holds the read end. But why this wait? There is no need to keep this pipe open in the first place.
This is where fork() comes into the picture. A forks B, and then B interacts with A. When A fork()s B, B gets a copy of all of A's open file descriptors (FDs) as well. There you go!
After getting these FDs, B does not take care to close them. A, meanwhile, is waiting for the pipe to be closed, and the kernel treats the pipe as still in use as long as any copy of the descriptor is open. Since B still holds one, A just waits :( And this wait never ends...
This was it. A simple close() call in B for all inherited FDs worked for us. And B happily got along with A.

A word of advice: always call exit() from the child. exit() does basic clean-up and then calls _exit(), which does more work, including closing all files the child has open.

Just to verify, you can use this test program:

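#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    char buf[512];

    /* O_CREAT needs a mode argument; open read-only so read() is valid */
    int fd = open("abc.txt", O_RDONLY | O_CREAT, 0644);
    if (fd < 0) {
        perror("open");
        return EXIT_FAILURE;
    }

    pid_t pid = fork();

    if (pid == 0) { // Child: exit() closes its copy of fd
        puts("Child says bye");
        exit(EXIT_SUCCESS);
    } else { // Parent: its own fd stays valid after the child exits
        sleep(1);
        ssize_t n = read(fd, buf, 16);
        printf("\nRead returns %zd\n", n);
        exit(EXIT_SUCCESS);
    }
}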
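
The pipe situation from the story can be sketched like this. It is my own reduced version, not the real A and B, and read_after_child_closes is a made-up name. The child inherits both ends of a pipe and closes them, so the parent's read() sees end-of-file instead of blocking forever.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that inherits both ends of a pipe, closes them, and exits.
   The parent then reads from the pipe. Returns what read() returned:
   0 means end-of-file, -1 means something failed. */
long read_after_child_closes(void)
{
    int fds[2];
    char buf[16];

    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {          /* child: drop the inherited descriptors */
        close(fds[0]);
        close(fds[1]);
        _exit(0);
    }

    close(fds[1]);           /* parent keeps only the read end */
    /* read() blocks until data arrives or every write end is closed */
    ssize_t n = read(fds[0], buf, sizeof buf);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (long)n;
}
```

If the child kept its copy of fds[1] open and kept running, the parent's read() would keep waiting, which is the hang described in the story.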

Thursday, July 2, 2009

Lichen: Our first invention

We, a group of three friends, have got our first ever invention disclosure published today. It's quite a happy moment for all of us. We started with a small idea that eventually got transformed into a serious paper :)

Title : "Lichen: A Framework For Characterizing Sporadic Performance Constrictions In Non-deterministic Applications"

The idea deals with better profiling of applications through a special way of collecting samples.
Our paper is published in the worldwide publication Research Journal. (http://www.researchdisclosure.com)

Wednesday, July 1, 2009

Terminal problem with csh/ksh

If you are a bash addict and you have to work in csh/ksh, you may face terminal problems while using vi/more or any such command. vi expects complete knowledge of the terminal where it's going to display its contents. There is a simple way to fix it.

1. First get the shell type with: $echo $SHELL
This is just to confirm that you are in csh/ksh.

2. Figure out what the terminal type is. If you do not know, ask your system administrator.

3. If you can't, that's also okay. :-) We'll try the default terminal emulator type called "vt100".

4. Now you need to export the terminal type to your shell.
In csh:
$setenv TERM vt100
In ksh:
$export TERM=vt100
Then run:
$tset

5. There you go. Try to open a vi session.

Let me know if you face any issues.

Wednesday, June 3, 2009

Installation: Fedora core 9

- Installing printer on FC9
Use 'system-config-printer' command. For more details, refer following link.
http://foo2zjs.rkkda.com/fedora/hp1020.html

How to install Flashplayer in Firefox on Linux?

Firefox on Linux needs a manual installation of Flash player. Surprisingly it's a bit messy, so here are a few steps to accomplish this task:

- Download the rpm/deb package to your system (typically titled adobe-release-i386-1.0-1.noarch.rpm / flash-plugin-10.0.22.87-release.i386.rpm).

- Install it with yum/apt-get as follows:
$yum install adobe-release-i386-1.0-1.noarch.rpm

- Open the Firefox browser and type "about:plugins" in the address bar.
If you see a Flash player entry there, you are fine. Else, go to the next step.

- Run $updatedb
It'll update the database for the "locate" command.

- Run $locate libflashplayer
Most probably it will point you to the "/usr/lib/flash-plugin" directory. Here you will find libflashplayer.so.

- Copy "libflashplayer.so" to the Firefox installation directory (e.g. "/usr/local/bin/firefox/plugins/"). Also copy it to "/usr/lib". This will make the library available to all users.

- Check your Firefox browser again with "about:plugins". You should see a couple of Flash player entries there.

That's all. Try playing a YouTube video to verify :-)

Ref: http://www.linuxquestions.org/questions/linux-software-2/how-to-install-flashplayer-in-firefox-487406/

Friday, May 29, 2009

Windows flavor in Linux, YUM

Linux is hard on naive users, especially when installing new software. There always seems to be some dependency missing. That's what I used to think. But no more. Linux has moved from a messy, dirty way of installing software to a very clean and simple one.

Yes, it comes with Fedora 9 (the community successor to good old Red Hat Linux), and it's called YUM.
Yum makes installing a new RPM (Red Hat Package Manager) package butter smooth and slightly simpler than on Windows.
All Yum demands is an Internet connection and lo, you are done.

Just download your desired RPM and pass it to yum like this:

$yum install my.rpm

After asking you for a 'YES', it will take care of everything.

Simple, right?

Thursday, May 21, 2009

Idiotic scanf() : Scanning whitespaces in a string

In C, a few functions are really nasty and scanf() is one of them. It's especially bad for inputting strings.
What will happen if you give it a string:

-> scanf("%s", string)

-> Input: "C is stupid".

Well, you will have only "C" in "string".

Reason: It happens because scanf() treats white-space as a delimiter for %s. So "C", being followed by a white space, is taken as the whole input. Simple, isn't it?

Remedy: You might try something like the following:

int main()
{
char arr[10];
scanf("%*[ \t\n]%s", arr);
puts(arr);
}

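IT DOESN'T WORK!!! The %*[ \t\n] directive must match at least one white-space character, so with input that starts with "C" it fails right away, and %s would still stop at the first space anyway.

One way that does work is this:

#include <stdio.h>
#define SIZE 128

int main(void) {
    char arr[SIZE];
    int ch;          /* int, so EOF can be told apart from a character */
    int index = 0;

    /* stop at SIZE - 1 to leave room for the terminating '\0' */
    while(index < SIZE - 1 && (ch = getchar()) != EOF && ch != '\n')
    {
        arr[index] = (char)ch;
        index++;
    }
    arr[index] = '\0';
    puts(arr);
    return 0;
}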

Done!
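
Another option, offered as a sketch rather than the last word: fgets() from the standard library reads a whole line, spaces included. The wrapper name read_line is my own.

```c
#include <stdio.h>
#include <string.h>

/* Read one line, spaces included, from `in` into buf (at most size - 1
   characters). The trailing newline is removed.
   Returns 1 on success, 0 on end-of-file or error. */
int read_line(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}
```

Calling read_line(stdin, arr, sizeof arr) with the input "C is stupid" should leave the whole sentence in arr.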

Wednesday, April 15, 2009

GNU gdb: The toolbox of commands

Applications often receive signals during execution. By default gdb has a setting for every UNIX-defined signal. If that default is to "stop" the application, it becomes quite irritating.
To change how a signal is handled: (gdb) handle SIGUSR1 nostop
The main options are:
nostop
GDB should not stop your program when this signal happens. It may still print a message telling you that the signal has come in.

stop
GDB should stop your program when this signal happens. This implies the print keyword as well.

pass
noignore
GDB should allow your program to see this signal; your program can handle the signal, or else it may terminate if the signal is fatal and not handled. pass and noignore are synonyms.

nopass
ignore
GDB should not allow your program to see this signal. nopass and ignore are synonyms.

(Ref: http://sources.redhat.com/gdb/current/onlinedocs/gdb_6.html#SEC44)

Ruby on Rails

Bored of J2EE? I came across a new web app development framework called Ruby on Rails. Based on the Agile development philosophy, it's one of the most popular web frameworks today.

As the name suggests, it lets you develop your web application faster and more conveniently. All you need is to know Ruby before getting started.

Check out more at: http://rubyonrails.org/

Tuesday, March 17, 2009

Linux Scheduling: A Few Facts

- The Linux scheduler favors I/O bound processes. It uses dynamic priorities to schedule processes, so a process that has not got the CPU for a long time gets its priority increased, and vice versa.

- Processes are moved between different queues, and all processes on the ready queue are assigned an 'epoch'. The epoch is relevant for processes in the ready queue only.

- Each process is assigned a quantum, which is the CPU time allotted to it. If a process is blocked, it does not use its quantum, and the unused quantum is carried forward to the next epoch. An epoch completes as soon as all processes in the ready queue have used up their quantum.

- The dynamic priority ("goodness") of a process is calculated from its base priority and quantum. Hence an I/O bound process that is blocked for a long time gets its priority improved every time it saves its quantum while blocked.
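
The carry-forward idea can be put into a toy model. This is a simplified sketch loosely following the old epoch-based scheduler, where half of the unused quantum is carried into the next epoch; the function names and the tick values in the usage note are made up for illustration.

```c
/* Quantum for the next epoch: the base quantum plus half of whatever
   was left unused in the previous epoch. */
int next_quantum(int unused_ticks, int base_ticks)
{
    return unused_ticks / 2 + base_ticks;
}

/* Toy "goodness": a process with no quantum left is not eligible (0);
   otherwise the remaining quantum is added to the base priority. */
int goodness(int remaining_ticks, int base_priority)
{
    if (remaining_ticks == 0)
        return 0;
    return remaining_ticks + base_priority;
}
```

With a base quantum of 6 ticks, a process that stayed blocked for a whole epoch enters the next one with 9 ticks instead of 6, so it scores higher than a CPU bound process that used everything up.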

Tuesday, February 17, 2009

Linux Tips

- Changing Linux root password
http://www.paulspoerry.com/2008/09/06/replace-linux-root-password/

- Ways of saying 'Hello! World' : http://www.fitzrovian.com/hello.html
- sizeof (type) requires parentheses, while sizeof expression does not.

- How to easily read a declaration from left to right:
transform function argument types from inside out first
move the base type to the end
add outer parentheses if there's an initial *
change every (*...) to ... ->
one -> for each *
move qualifiers, so * const becomes const ->

Example: const int *(**const x [])()

*(**const x [])() const int              base type to end
(*(**const x [])()) const int            add outer parens
(**const x [])() -> const int            remove outer ()
x [] const -> -> () -> const int         remove inner ()

array of constant pointers to pointers to functions
returning pointers to constant ints

- p + 1 == &p [1]
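
To check the declaration reading against a compiler, here is a small sketch. The names answer, get_answer, fn, call_first and same_address are all made up, and I wrote (void) parameter lists where the example above has empty parentheses.

```c
static const int answer = 42;

/* A function returning a pointer to a constant int. */
static const int *get_answer(void)
{
    return &answer;
}

/* A pointer to such a function. */
static const int *(*fn)(void) = get_answer;

/* Array of constant pointers to pointers to functions
   returning pointers to constant ints. */
const int *(**const x[])(void) = { &fn };

/* Follow the whole chain: array element -> pointer -> function -> int. */
int call_first(void)
{
    return *(**x[0])();
}

/* p + 1 and &p[1] name the same address for any pointer p. */
int same_address(const int *p)
{
    return p + 1 == &p[1];
}
```

If the declaration were read any other way, the initializer { &fn } would not compile without a warning.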


Tuesday, February 10, 2009

Unfolding 2-D array in C

Last weekend my friends and I had a discussion on 2-D array in C. Surprisingly it went long as everyone had some points and knowledge to share :) Finally we could gather some important information and understood this pearl of C better.

I'll clear this idea with C code snippets.

// Declare an array
int arr[2][3]= {1,2,3,4,5,6};

Its size is 2*3*sizeof(int) = 24 bytes (Intel x86, 4-byte int)

Okay, now what will get printed for the following statements:
// Say starting address is 1024

o) print &arr + 1
o) print arr + 1
o) print *arr + 1

To answer these questions better, I'll explain what exactly a 2-D array is. When we declare and define a 2-D array, it's like a new user-defined type to C. It names a contiguous block of memory that holds 6 integers, and in most expressions it decays to a pointer to its first row. Logically it's like holding many 1-D arrays one after another.

Ans 1: '&arr' is the address of the whole user-defined type, and it's 1024. Incrementing it by 1 means moving the pointer by the size of this type, which is 24 bytes. Thus it'll print 1048.

Ans 2: 'arr' gives us the address of the 1-D array that is the 0th row of our matrix. arr+1 gets us the address of the next row, i.e. 1024 + (3*4) = 1036.

Ans 3: '*arr' is the 0th row itself, which decays to the address of its 0th element (1024). So *arr + 1 is the address of the next element, i.e. 1024 + 4 = 1028. To get the value 2 you would need **arr + 1.
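
The three step sizes can be measured directly instead of assuming a start address of 1024. A sketch with made-up helper names, where every result is a byte offset from the start of arr:

```c
#include <stddef.h>

static int arr[2][3] = { {1, 2, 3}, {4, 5, 6} };

/* Distance in bytes from the start of the array to p. */
static ptrdiff_t offset_of(const void *p)
{
    return (const char *)p - (const char *)arr;
}

/* &arr + 1 skips the whole array: 6 ints. */
ptrdiff_t whole_array_step(void) { return offset_of(&arr + 1); }

/* arr + 1 skips one row: 3 ints. */
ptrdiff_t row_step(void) { return offset_of(arr + 1); }

/* *arr + 1 skips one element: 1 int. */
ptrdiff_t element_step(void) { return offset_of(*arr + 1); }

/* **arr is the first value, so this is 1 + 1. */
int first_value_plus_one(void) { return **arr + 1; }
```

On a machine with 4-byte ints the three offsets come out as 24, 12 and 4, which matches 1048, 1036 and 1028 for a start address of 1024.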

Thursday, January 15, 2009

More Linux Tips

- Detailed info about the system: these are kernel messages, kept in the kernel's log buffer and printed by the /bin/dmesg command.
$dmesg

- Determining runlevel {Shows previous & current runlevel}
$runlevel
0 -> Halt
1 -> Single User Mode
2 -> Multiuser w/o NFS
3 -> Full Multiuser
6 -> Reboot

To change the default runlevel, edit /etc/inittab

- To switch runlevels, use the init command.
$init 3

- Difference between Paging and Swapping
Paging refers to the movement of individual pages between memory frames and the disk. It's a considerably cheaper operation. Swapping means moving the entire address space of a process to disk. This is a very expensive operation and happens when a process sleeps or when thrashing occurs.

Monday, January 12, 2009

Linux Linker Unveiled: The Secret Services

The Linux linker does not seem to be a lead actor in the play of an application. But it works like the script, which is essential to the success of the play. The linker (ld) is a piece of software that is primarily responsible for address patching, a.k.a. relocation. Object files bear absolute and relative addresses.

// test.c
int main()
{
foo();
}

Disassembly of section .text:

00000000 <main>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 83 ec 08 sub $0x8,%esp
6: 83 e4 f0 and $0xfffffff0,%esp
9: b8 00 00 00 00 mov $0x0,%eax
e: 29 c4 sub %eax,%esp
10: e8 fc ff ff ff call 11 <main+0x11>
15: c9 leave
16: c3 ret

# An object file always starts at address 0x0000.

You can see that addresses are relative to the module and the called function foo is completely unknown to main. The linker takes these object files, relocates their addresses to load addresses (virtual addresses), and resolves calls to functions using a function offset table (the relocation records).

To get the function offset table:
$objdump -x ./test.o

RELOCATION RECORDS FOR [.text]:
OFFSET TYPE VALUE
00000011 R_386_PC32 foo

All functions called from a file are listed in this table, and here the reference to foo() sits at offset 0x11. Once all object files are available, the linker patches the bytes at that offset so that the call reaches foo()'s virtual address. Though the call to foo() has been patched to reach the correct virtual address, the offset of the patched bytes within main is still the same (main+0x11).

08048344 <main>:
8048344: 55 push %ebp
8048345: 89 e5 mov %esp,%ebp
8048347: 83 ec 08 sub $0x8,%esp
804834a: 83 e4 f0 and $0xfffffff0,%esp
804834d: b8 00 00 00 00 mov $0x0,%eax
8048352: 29 c4 sub %eax,%esp
8048354: e8 03 00 00 00 call 804835c <foo>
8048359: c9 leave
804835a: c3 ret
804835b: 90 nop

0804835c <foo>:
804835c: 55 push %ebp
804835d: 89 e5 mov %esp,%ebp
804835f: 5d pop %ebp
8048360: c3 ret
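
The patched bytes can be worked out by hand. For an R_386_PC32 record the linker stores S + A - P, where S is the symbol's address, A is the addend already sitting in the instruction (fc ff ff ff, i.e. -4) and P is the address being patched. The sketch below just does that arithmetic with the numbers from the listings above; the function names are mine.

```c
#include <stdint.h>

/* R_386_PC32: value stored at the patched location is S + A - P. */
uint32_t apply_pc32(uint32_t symbol, int32_t addend, uint32_t place)
{
    return symbol + (uint32_t)addend - place;
}

/* A call's target is the address of the next instruction plus the
   32-bit displacement stored in the call. */
uint32_t call_target(uint32_t next_insn, int32_t displacement)
{
    return next_insn + (uint32_t)displacement;
}
```

Here S = 0x804835c (foo), A = -4 and P = 0x8048344 + 0x11 = 0x8048355, which gives 3. That is the "e8 03 00 00 00" in the linked code, and 0x8048359 + 3 lands back on foo at 0x804835c.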

Friday, January 9, 2009

Linux Secrets!

1. Linux uses a COW (copy-on-write) scheme in virtual memory management.
2. Threads in Linux can be LinuxThreads or NPTL (Red Hat). NPTL is more efficient and is used from kernel 2.6 onwards. Using the env variable LD_ASSUME_KERNEL you can decide which thread library to choose.
3. The Linux kernel does not discriminate between threads and processes while making scheduling decisions.
4. Memory allocation of a process can be seen with the 'pmap' command.
5. The Linux CPU scheduler is an O(1) scheduler, i.e. regardless of the number of processes, it always takes constant time to select a process.
6. Linux memory management for IA-32 can directly address only about 1GB of physical memory. Memory beyond this has to be mapped into the 1GB range, hence allocating a page beyond 1GB degrades performance.
7. IA64 can address 64-128GB of memory.
8. Linux allocates a large part of the disk's free space to swap. This improves the performance of the VMM.
9. It follows the Buddy System for page allocation, which tries to keep memory addresses contiguous. Have a look at /proc/buddyinfo.
10. If no free page is available, the kswapd kernel thread reclaims pages. This thread is used by the buddy system and follows LRU. Kernel pages are never swapped out of memory.

Thursday, January 8, 2009

Stack Unwinding

A very nice explanation of what stack unwinding is and why it is needed.
http://docs.hp.com/en/B9106-90012/unwind.5.html

Wednesday, January 7, 2009

Linux Memory Management Secrets!

Tips to Improve Dynamic Memory Performance

- Instead of using memset() to initialize malloc()'ed memory, use calloc(). When you call memset(), the VM system has to map the pages into memory in order to zero-initialize them. That's very expensive and wasteful if you don't intend to use the pages right away.
calloc() reserves the needed address space but does not zero-initialize it until the memory is used. Hence it postpones the need to load pages into memory. It also lets the system initialize pages as they're used, as opposed to all at once.

- Lazy allocation: A global (normal variable or a buffer) can be replaced with a static and a couple of functions to allow its access.

- memcpy() & memmove() need both blocks to be memory resident. Use them if the blocks are small (<16KB), you will be using the blocks right away, the source/destination blocks are not page aligned, or the blocks overlap (memmove() only).
But if you intend to postpone the use, you would be increasing the working set of the application. For small amounts of data, use memcpy().

- To check for dysfunctional heap behavior: $ MALLOC_CHECK_=1 ./a.out
It'll give an address related to each violation of the dynamic memory routines.

- Electric fence : Works very well with gdb

- Libsafe: for libc routines
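
The calloc() and lazy allocation tips combine naturally. Here is a minimal sketch, assuming a hypothetical 1MB buffer: the global is replaced by a static that is allocated with calloc() only when somebody first asks for it. BUFFER_SIZE and the three function names are made up for this example.

```c
#include <stdlib.h>

#define BUFFER_SIZE (1024u * 1024u)  /* hypothetical size: 1MB */

/* The buffer is hidden behind accessor functions instead of being a
   plain global. */
static char *buffer = NULL;

/* Return the shared buffer, allocating it on first use. calloc() hands
   back zeroed memory, so no memset() is needed. Returns NULL if the
   allocation fails. */
char *get_buffer(void)
{
    if (buffer == NULL)
        buffer = calloc(BUFFER_SIZE, 1);
    return buffer;
}

/* Report whether the buffer has been allocated yet. */
int buffer_is_allocated(void)
{
    return buffer != NULL;
}

/* Free the buffer; the next get_buffer() call allocates a fresh one. */
void release_buffer(void)
{
    free(buffer);
    buffer = NULL;
}
```

A program that never calls get_buffer() never pays for the buffer at all. This sketch is not thread-safe; with several threads the first-use check would need a lock or pthread_once().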